Enhancements to NFS
王信富 R88725039
2000/11/6
Introduction
File system modules
– Directory module
– File module
– Access control module
– File access module
– Block module
Distributed file system requirements
– Transparency
  » Access transparency
  » Location transparency
  » Scaling transparency
– Consistency
– Security
Sun NFS architecture
Andrew file system architecture
Mobility enhancement
Mobile file system (MFS)
– Client modules
– Proxy modules
Source: Maria-Teresa Segarra, IRISA Research Institute, Campus de Beaulieu
Mobility enhancement (cont.)
NFS/M
– Enables the mobile user to access information regardless of
  » the location of the user
  » the state of the communication channel
  » the state of the data server
NFS/M architecture
NFS/M modules
Cache Manager (CM)
– All file system operations on cached objects in the local disk cache are managed by the CM
– It functions only in the connected phase
Proxy Server (PS)
– Emulates the functionality of the remote NFS server by using the file system objects cached on the local disk
– It functions in the disconnected phase
Reintegrator (RI)
– Propagates the changes made to data objects in the local disk cache during the disconnected period back to the NFS server
– Three tasks for the RI
  » Conflict detection
  » Update propagation
  » Conflict resolution
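The three Reintegrator tasks can be sketched as follows. The slides do not say how NFS/M detects conflicts, so this sketch assumes a simple version comparison: each cached file remembers the server version it was fetched at, and a conflict is declared when the server copy has moved on since then. `CachedFile`, `reintegrate`, and the dictionary-based `server` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    data: bytes
    base_version: int    # server version seen when the file was cached (assumed scheme)
    dirty: bool = False  # True if modified while disconnected


def reintegrate(cache, server):
    """Replay disconnected updates; return (propagated, conflicts) name lists.

    cache:  dict name -> CachedFile
    server: dict name -> (version, data), standing in for the NFS server
    """
    propagated, conflicts = [], []
    for name, entry in sorted(cache.items()):
        if not entry.dirty:
            continue
        server_version, _ = server.get(name, (0, b""))
        if server_version != entry.base_version:
            # Conflict detection: the server copy changed during disconnection.
            # Conflict resolution is left to a policy or to the user.
            conflicts.append(name)
            continue
        # Update propagation: push the local change and bump the version.
        new_version = server_version + 1
        server[name] = (new_version, entry.data)
        entry.base_version, entry.dirty = new_version, False
        propagated.append(name)
    return propagated, conflicts
```

A file whose server version still matches the cached base version is written back; any other dirty file is reported for resolution instead of being overwritten.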
Data Prefetcher (DP)
– Improves data access performance
– Data prefetching techniques can be classified into two categories
  » Informed prefetching
  » Predictive prefetching
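As an illustration of predictive prefetching, the sketch below guesses the next file from past access order. It is not the NFS/M algorithm, which the slides do not describe; it assumes a first-order "most frequent successor" model, and `SuccessorPredictor` is a hypothetical name.

```python
from collections import Counter, defaultdict


class SuccessorPredictor:
    """Predict the next file from the file that was accessed last."""

    def __init__(self):
        self.successors = defaultdict(Counter)  # path -> Counter of next paths
        self.last = None

    def record(self, path):
        """Note an access and update the successor counts."""
        if self.last is not None:
            self.successors[self.last][path] += 1
        self.last = path

    def predict(self, path):
        """Return the most frequent successor of path, or None if unknown."""
        counts = self.successors.get(path)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

A prefetcher would call `predict` after each access and fetch the returned file into the local disk cache while the link is still up. Informed prefetching differs in that the application supplies the hints itself.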
Phase of the NFS/M
The NFS/M client maintains an internal state, termed the phase, which indicates how file system service is provided under different conditions of network connectivity
Three phases:
– Connected phase
– Disconnected phase
– Reintegration phase
Source: John C.S. Lui, Oldfield K.Y. So, T.S. Tam, Department of Computer Science & Engineering
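The phase logic can be sketched as a small state machine. The three phases and the module active in each come from the slides; the event names and the exact transitions (for example, falling back to the disconnected phase if the link drops during reintegration) are assumptions made for illustration.

```python
from enum import Enum


class Phase(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    REINTEGRATION = "reintegration"


# Assumed transition table: (current phase, event) -> next phase.
TRANSITIONS = {
    (Phase.CONNECTED, "link_down"): Phase.DISCONNECTED,
    (Phase.DISCONNECTED, "link_up"): Phase.REINTEGRATION,
    (Phase.REINTEGRATION, "done"): Phase.CONNECTED,
    (Phase.REINTEGRATION, "link_down"): Phase.DISCONNECTED,
}

# Module that services requests in each phase, as listed on the slides.
ACTIVE_MODULE = {
    Phase.CONNECTED: "Cache Manager",
    Phase.DISCONNECTED: "Proxy Server",
    Phase.REINTEGRATION: "Reintegrator",
}


def next_phase(phase, event):
    """Return the phase after an event; unknown events leave it unchanged."""
    return TRANSITIONS.get((phase, event), phase)
```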
Case: Wireless Andrew
It builds on the university's wired network infrastructure, which currently provides 10/100 Mb/s Ethernet service
To supply high-speed wireless service to the campus, Lucent WaveLAN equipment has been installed
For wireless access off campus, or otherwise out of range of the WaveLAN network, cellular digital packet data (CDPD) is used
Source: Wireless Andrew [mobile computing for university campus], IEEE, 1999
Reference URL
http://csep1.phy.ornl.gov/nw/node3.html
http://www.dqc.org/~chris/tcpip_ill/nfs_netw.htm
http://www.coda.cs.cmu.edu/
http://www.uwsg.iu.edu/usail/network/nfs/overview.html
http://www.netapp.com/tech_library/nfsbook.html
Scalability of NFS
Student: 朱漢農 R88725032
2000/11/6
NFS - Scalability
AFS - Scalability
NFS Enhancement - Spritely NFS, NQNFS, WebNFS, NFS Version 4
AFS Enhancement - RAID, LSF, xFS
Frangipani
NFS - Scalability
The performance of a single server can be increased by the addition of processors, disks and controllers.
When the limits of that process are reached, additional servers must be installed and the filesystems must be reallocated between them.
NFS - Scalability (cont'd)
The effectiveness of that strategy is limited by the existence of 'hot spot' files.
When loads exceed the maximum performance, a distributed file system that supports replication of updatable files, or one that reduces the protocol traffic by the caching of whole files, may offer a better solution.
AFS - Scalability
The differences between AFS and NFS are attributable to the identification of scalability as the most important design goal.
The key strategy is the caching of whole files in client nodes.
AFS - Scalability (cont'd)
Whole-file serving: The entire contents of directories and files are transmitted to client computers by AFS servers.
Whole-file caching: Once a copy of a file has been transferred to a client computer, it is stored in a cache on the local disk.
The cache is permanent, surviving reboots of the client computer.
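Whole-file serving and caching can be sketched as below. This is a toy model, not AFS code: `WholeFileCache` and the `fetch` callable are hypothetical, cache file names are derived from a hash purely for convenience, and cache validation against the server is omitted.

```python
import hashlib
import os


class WholeFileCache:
    """Keep complete copies of remote files in a directory on the local disk."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = cache_dir
        self.fetch = fetch  # hypothetical callable: remote path -> bytes
        os.makedirs(cache_dir, exist_ok=True)

    def _local_path(self, remote_path):
        digest = hashlib.sha256(remote_path.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, digest)

    def read(self, remote_path):
        """Return the file contents, transferring the whole file on a miss."""
        local = self._local_path(remote_path)
        if not os.path.exists(local):
            data = self.fetch(remote_path)  # whole-file serving
            with open(local, "wb") as f:
                f.write(data)               # whole-file caching on local disk
        with open(local, "rb") as f:
            return f.read()
```

Because the copies live on disk rather than in memory, a new `WholeFileCache` created over the same directory after a restart finds them again, which models the cache surviving a client reboot.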
NFS enhancement - Spritely NFS
Spritely NFS is an implementation of the NFS protocol with the addition of open and close calls.
The parameters of the Sprite open operation specify a mode and include counts of the number of local processes that currently have the file open for reading and for writing.
Spritely NFS implements a recovery protocol that interrogates a list of clients to recover the full open files table.
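A minimal sketch of the server-side open files table follows. The class, method names, and report format are hypothetical. The caching rule used here (disable client caching when a file is open on more than one client and at least one of them is writing) is the usual description of Sprite-style consistency and is stated as an assumption, since the slides do not spell it out.

```python
class OpenFilesTable:
    """Server-side record of which clients have which files open."""

    def __init__(self):
        self.table = {}  # path -> {client: (readers, writers)}

    def update(self, client, path, readers, writers):
        """Handle an open or close call carrying the client's current counts.

        Returns True if clients may cache the file after this call.
        """
        clients = self.table.setdefault(path, {})
        if readers == 0 and writers == 0:
            clients.pop(client, None)  # last local process closed the file
        else:
            clients[client] = (readers, writers)
        if not clients:
            del self.table[path]
        return self.cacheable(path)

    def cacheable(self, path):
        """Assumed rule: no caching while the file is write-shared."""
        clients = self.table.get(path, {})
        any_writer = any(w > 0 for _, w in clients.values())
        return not (any_writer and len(clients) > 1)

    @classmethod
    def recover(cls, reports):
        """Rebuild the table after a crash from per-client reports.

        reports: {client: {path: (readers, writers)}}
        """
        fresh = cls()
        for client, files in reports.items():
            for path, (readers, writers) in files.items():
                fresh.update(client, path, readers, writers)
        return fresh
```

The `recover` method mirrors the recovery protocol on the slide: the restarted server asks each known client what it has open and reconstructs the full table from the answers.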
NFS enhancement - NQNFS
NQNFS maintains similar client-related state concerning open files, but it uses leases to aid recovery after a server crash.
Callbacks are used in a similar manner to Spritely NFS to request clients to flush their caches when a write request occurs.
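Why leases help crash recovery can be sketched as follows. The idea assumed here is that a lease is valid only for a bounded time, so a restarted server that has lost its state only needs to wait one lease period before granting new leases; by then every lease issued before the crash has expired. `LeaseServer` and its methods are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class LeaseServer:
    """Grant time-limited caching leases; state is lost on a crash."""

    def __init__(self, lease_seconds, boot_time):
        self.lease_seconds = lease_seconds
        self.boot_time = boot_time
        self.leases = {}  # (client, path) -> expiry time

    def grant(self, client, path, now):
        """Return the lease expiry time, or None while still recovering."""
        if now < self.boot_time + self.lease_seconds:
            # Leases granted before the crash may still be live, so refuse
            # new ones until they must all have expired (a simplification:
            # this sketch also waits after a clean first boot).
            return None
        expiry = now + self.lease_seconds
        self.leases[(client, path)] = expiry
        return expiry

    def holders(self, path, now):
        """Clients whose lease on path has not yet expired."""
        return sorted(
            client
            for (client, p), expiry in self.leases.items()
            if p == path and expiry > now
        )
```

In contrast to the Spritely NFS approach, nothing has to be collected from the clients after a restart; the passage of time alone makes the old state irrelevant.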
NFS enhancement - WebNFS
WebNFS makes it possible for application programs to become clients of NFS servers anywhere in the Internet (using the NFS protocol directly).
This enables Internet applications that share data directly, such as multi-user games or clients of large dynamic databases.
NFS enhancement - NFS version 4
NFS version 4 will include
– the features of WebNFS
– the use of callbacks or leases to maintain consistency
– on-the-fly recovery
Scalability will be improved by using proxy servers in a manner analogous to their use in the Web.
AFS enhancement
RAID
Log-structured file storage
xFS
– implements a software RAID storage system, striping file data across disks on multiple computers together with a log-structuring technique.
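The striping part of this design can be illustrated with the address arithmetic below. It is a plain round-robin (RAID-0 style) mapping written for illustration only; parity and log structuring, which xFS also uses, are left out, and `locate` and its parameters are hypothetical.

```python
def locate(offset, stripe_unit, n_servers):
    """Map a byte offset in a striped file to (server index, local offset).

    Stripe units are dealt out to servers round-robin.
    """
    if offset < 0 or stripe_unit <= 0 or n_servers <= 0:
        raise ValueError("offset must be >= 0; sizes must be positive")
    unit = offset // stripe_unit   # which stripe unit holds the byte
    server = unit % n_servers      # round-robin placement
    row = unit // n_servers        # earlier units already on that server
    return server, row * stripe_unit + offset % stripe_unit
```

With 64-byte stripe units over 4 servers, byte 300 falls in unit 4, which wraps around to server 0 as that server's second unit, so `locate(300, 64, 4)` returns `(0, 108)`.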
Frangipani
A highly scalable distributed file system developed and deployed at the Digital Systems Research Center.
Frangipani (cont'd)
The responsibility for managing files and associated tasks is assigned to hosts dynamically.
All machines see a unified file name space with coherent access to shared updatable files.
Frangipani - System Structure
Two totally independent layers:
1. Petal distributed virtual disk system
- Data is stored in a log-structured and striped format in the virtual disk store.
- Provides a storage repository
- Provides highly available storage that can scale in throughput and capacity as resources are added to it
- Petal implements data replication for high availability, obviating the need for Frangipani to do so.
Frangipani - System Structure (cont'd)
2. Frangipani server modules
- Provide names, directories, and files
- Provide a file system layer that makes Petal useful to applications while retaining and extending its good properties.
Frangipani - Logging and Recovery
Frangipani uses write-ahead redo logging of metadata to simplify failure recovery and improve performance.
User data is not logged.
Each Frangipani server has its own private log in Petal.
As long as the underlying Petal volume remains available, the system tolerates an unlimited number of Frangipani server failures.
Frangipani's locking protocol ensures that updates requested to the same data by different servers are serialized.
Frangipani ensures that recovery applies only updates that were logged since the server acquired the locks that cover them, and for which it still holds the locks.
Recovery never replays a log record describing an update that has already been completed.
For each block that a log record updates, the record contains a description of the changes and the new version number.
During recovery, the changes to a block are applied only if the block version number is less than the record version number.
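The version-number rule above can be sketched directly. `Block`, `LogRecord`, and `replay` are hypothetical names, and a block's contents are reduced to a single string so the comparison stays visible.

```python
from dataclasses import dataclass


@dataclass
class Block:
    version: int
    data: str


@dataclass
class LogRecord:
    block_id: int
    new_version: int
    new_data: str  # stands in for "a description of the changes"


def replay(log, disk):
    """Apply redo records to disk; return how many were actually applied.

    A record is applied only if the block's version number is less than
    the record's, so updates that already reached the disk are skipped.
    """
    applied = 0
    for record in log:
        block = disk[record.block_id]
        if block.version < record.new_version:
            block.data = record.new_data
            block.version = record.new_version
            applied += 1
    return applied
```

The check makes recovery idempotent: running `replay` a second time over the same log applies nothing, because every block already carries the version its record would install.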
Frangipani reuses freed metadata blocks only to hold new metadata.
At any time, only one recovery demon is trying to replay the log region of a specific server.
If a sector is damaged such that reading it returns a CRC error, Petal's built-in replication can recover it.
In both local UNIX and Frangipani, a user can get better consistency semantics by calling fsync at suitable checkpoints.
Frangipani - Synchronization and Cache Coherence
Frangipani uses multiple-reader/single-writer locks to implement the necessary synchronization.
When the lock service detects conflicting lock requests, the current holder of the lock is asked to release or downgrade it to remove the conflict.
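The release-or-downgrade behaviour can be sketched with a toy, single-process lock table. `LockService` and `request` are hypothetical names; the real service is distributed, and a holder would write back its dirty data before giving up or downgrading a write lock, which this sketch skips.

```python
class LockService:
    """Toy multiple-reader/single-writer lock table."""

    def __init__(self):
        self.holders = {}  # lock name -> {server: "read" or "write"}

    def request(self, server, name, mode):
        """Grant the lock in the given mode.

        Returns the list of (holder, action) revocations that were needed,
        where action is "release" or "downgrade".
        """
        held = self.holders.setdefault(name, {})
        actions = []
        for other, other_mode in sorted(held.items()):
            if other == server:
                continue
            if mode == "write":
                actions.append((other, "release"))    # a writer excludes everyone
            elif other_mode == "write":
                actions.append((other, "downgrade"))  # readers can share with it
        for other, action in actions:
            if action == "release":
                del held[other]
            else:
                held[other] = "read"
        held[server] = mode
        return actions
```

A read request against a write lock only needs the writer to downgrade, so both servers end up holding the lock in read mode, while a write request forces every other holder to release.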
When a Frangipani server crashes, the locks that it owns cannot be released until appropriate recovery actions have been performed.
When a Frangipani server's lease expires, the lock service will ask the clerk on another machine to perform recovery and release all locks belonging to the crashed Frangipani server.
Petal can continue operation in the face of network partitions, as long as a majority of the Petal servers remain up and in communication.
The lock service continues operation as long as a majority of lock servers are up and in communication.
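The majority condition is simple arithmetic, sketched below with a hypothetical helper. A majority means strictly more than half, which is why at most one side of a partition can keep operating.

```python
def has_majority(up, total):
    """True if the reachable servers form a strict majority of all servers."""
    if total <= 0 or not 0 <= up <= total:
        raise ValueError("need 0 <= up <= total and total > 0")
    return up > total // 2
```

With five servers, a partition of three keeps running and the other two stop; with four servers split two and two, neither side has a majority.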
If a Frangipani server is partitioned away from the lock service, it will be unable to renew its lease.
If a Frangipani server is partitioned away from Petal, it will be unable to read or write the virtual disk.
Frangipani - Adding Servers
The new server need only be told which Petal virtual disk to use and where to find the lock service.
The new server contacts the lock service to obtain a lease, and determines which portion of the log space to use from the lease identifier.
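How a lease identifier might select a log region is sketched below. The slides only say that the portion of log space is determined from the lease identifier; the modulo mapping, the fixed region size, and the name `log_region` are all assumptions made for illustration.

```python
def log_region(lease_id, num_regions, region_bytes):
    """Return (start, end) byte offsets of a server's private log region.

    Assumed mapping: the lease identifier, taken modulo the number of
    regions, selects one fixed-size slice of the shared log space.
    """
    if num_regions <= 0 or region_bytes <= 0:
        raise ValueError("num_regions and region_bytes must be positive")
    index = lease_id % num_regions
    start = index * region_bytes
    return start, start + region_bytes
```

The point of deriving the region from the lease is that a new server needs no extra coordination step: obtaining a lease already tells it where its log lives.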
Frangipani - Removing Servers
Simply shut the server off.
It is preferable for the server to flush all its dirty data and release its locks before halting, but this is not strictly needed.
Frangipani - Server halts abruptly
Recovery will run on its log the next time one of its locks is needed, bringing the shared disk into a consistent state.
Petal servers can also be added and removed transparently. Lock servers are added and removed in a similar manner.
Frangipani - Scaling
Operational latencies are unchanged and throughput scales linearly as servers are added.
Performance is seen to scale well because there is no contention until the ATM links to the Petal servers are saturated.
Since the virtual disk is replicated, each write from a Frangipani server turns into two writes to Petal.
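The effect of replication on write scaling can be put into a back-of-the-envelope model. The function and all numbers used with it are illustrative assumptions, not measurements from the paper: throughput is taken to grow linearly with the number of servers until the Petal links saturate, and each logical write is assumed to consume link capacity once per replica.

```python
def write_throughput(n_servers, per_server_mb_s, petal_link_mb_s, replicas=2):
    """Estimate aggregate write throughput in MB/s under a simple model.

    Each write is sent to Petal once per replica, so the usable link
    capacity for logical writes is petal_link_mb_s / replicas.
    """
    if replicas <= 0:
        raise ValueError("replicas must be positive")
    demand = n_servers * per_server_mb_s   # linear scaling region
    ceiling = petal_link_mb_s / replicas   # saturation of the Petal links
    return min(demand, ceiling)
```

With a made-up 10 MB/s per server and 100 MB/s of Petal link capacity, two servers deliver 20 MB/s, while eight servers are capped at 50 MB/s because every write is sent twice.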
Frangipani - Conclusions
Frangipani provides its users with coherent, shared access to the same set of files, yet is scalable to provide more storage space, higher performance, and load balancing.
It was feasible to build because of its two-layer structure, consisting of multiple file servers running the same file system code on top of a shared Petal.
Reference Source
Timothy Mann and Edward K. Lee, Frangipani: A Scalable Distributed File System