Andrew McNab - Manchester HEP - 29 January 2002
SlashGrid (“/grid”)
a framework for Grid-aware filesystems
• Motivation: dynamic-accounts issues
• Local storage: implementation alternatives
• Generalisation: remote file access
• Implementation: Coda, ACLs, plugins
• Current status
• Future work
Motivation: dynamic accounts
• For TB1 we provided a patch for the Globus gatekeeper, gsi_wuftpd etc. to associate Unix UIDs from a pool with the Grid DN identities of incoming requests.
• This is OK when all that jobs do on the machine in question is computation.
• But (1) any files created by a pool UID need to be cleaned up before the account can be reallocated.
• But (2) no good for long-term storage, since there is no promise to maintain the UID-DN association in the long term.
• But (3) what if a malicious user creates a cron entry, writes to some obscure writeable directory we didn’t think of, etc.?
Solution: get away from UID filesystems
• All these problems are fundamentally because files are owned according to UID, but we want UID to have no long term meaning.
• Obvious solution is to have a filesystem where file ownership depends on Grid DNs, not temporary UIDs.
– Can then ban user processes from writing anywhere else (straightforward to impose this with a modified ext2 device driver: eg no disk files can be created if UID > 99)
• UID becomes as transitory as Process Group ID.
• Problem now becomes: how to implement a DN/Grid aware filesystem?
Implementation alternatives
• 1) Fake a filesystem by making user processes use modified versions of the open(), read() etc. system calls.
– Can do this by relinking, or by an interposition / bypass library that is preloaded before the real, shared libc.
– But this cannot enforce access restrictions on files accessible on local disk (since you can use a static binary and ignore permissions).
– Need to put the filesystem behind a server, accessed via TCP ports, named pipes, or shared memory (all the usual X tricks). This is going to be slow for streaming large files: the very thing we need to be fast.
• 2) Put the filesystem into the kernel.
– Lets the kernel enforce access control. Potentially as fast as normal disk.
– User space daemon useful to parse proxies, and do any remote IO.
Coda
• A suitable kernel module already exists for Linux: Coda
– introduced into main kernel tree in 1997 (during 2.1) and present in all 2.2 and 2.4 stable kernels.
• This is part of the Coda project at CMU, an open source fork of AFS2.
• Very similar architecture to AFS
– Kernel module and client side cache daemon (Venus)
– Kerberos based
• Already used “parasitically” by other Linux projects
– eg AVFS maps files to virtual filesystems (eg cd into a tar file…)
• Coda kernel module / Venus also available for *BSD and Windows 98/NT upwards.
Implementation with Coda
• Coda kernel module talks to client cache daemon by exchanging messages via /dev/cfs0
• Since we already have the kernel module, we just need to write a Venus-like daemon: SlashGrid (“/grid”)
• Coda implementation allows efficient streaming:
– open(), close(), stat() handled by calls to Venus/SlashGrid daemon
– coda_open call returns the inode of the cached copy to the kernel
– subsequent read() and write() operations handled by kernel itself, without daemon being involved.
– So streaming a local copy is just as fast as reading/writing a normal disk file.
• Since SlashGrid is called for each open() etc., it can enforce DN-based access control at that point.
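A minimal Python sketch of the open-time flow described on this slide: check the caller's DN once, then hand back the inode of the cached copy so later reads and writes bypass the daemon. The names `cache_path`, `handle_open` and the `is_allowed` callback are hypothetical, not part of SlashGrid or Coda, and the real exchange over /dev/cfs0 is not modelled.

```python
import os

# Cache directory named on the slides; overridable so the sketch runs anywhere.
CACHE_ROOT = "/var/spool/slashgrid/fcache"


def cache_path(grid_path, mount="/grid", cache_root=CACHE_ROOT):
    """Map a path under the mount point to its backing file in the cache."""
    rel = os.path.relpath(grid_path, mount)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("path is outside the mount point")
    return os.path.normpath(os.path.join(cache_root, rel))


def handle_open(grid_path, dn, is_allowed, cache_root=CACHE_ROOT):
    """Hypothetical daemon-side open handler.

    Access control happens here, once per open. The returned inode number
    stands in for what the daemon would pass back to the kernel module.
    """
    if not is_allowed(dn, grid_path):
        raise PermissionError(f"{dn} may not open {grid_path}")
    return os.stat(cache_path(grid_path, cache_root=cache_root)).st_ino
```

Because the check sits in the open handler, a denied DN never obtains an inode, and an allowed one pays the daemon round trip only once per open.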
System calls with SlashGrid
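[Diagram: two user processes above the kernel. Standard Unix: open(), read() and stat() on an ordinary directory go through the kernel to a real (ext2) disk. SlashGrid: open() and stat() on /grid/... are passed via /dev/cfs0 to the SlashGrid daemon, while read() and write() go through the kernel to the cached copy in /var/spool/slashgrid/fcache on the real disk.]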
Remote file access
• Another idea that has been around a while: AFS-like system using Grid protocols.
• All the usual advantages of a global filesystem
– Makes a lot of the tedious management of “parameter” files needed by jobs just another operating system service.
– Very useful for interactive users: they just see the Grid as one big file system.
– Makes all applications (even ls) Grid-enabled immediately.
• Already using URLs to refer to remote files, so easy to find an appropriate mapping into a filesystem space.
• So we want to design a system that can be generalised to remote file access too.
ACL format
• Need to specify permissions in some way.
• Commonly used compromise between granularity and simplicity is the per-directory ACL (cf AFS)
• We’ve used the same format as the GridSite website management system (used for WP6 and GridPP websites):
– admin: can modify ACL
– write: can write/create files
– list: can get a directory listing
– read: can read a named file
– ACL consists of lines: <level> <DN/group>
• Currently only implement <DN> but in future will add VO groups, CAS authorisation symbols etc (when dust settles...)
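A short Python sketch of reading ACL lines in the `<level> <DN>` form and answering a permission query. It assumes the four levels form a hierarchy in the order the slide lists them (admin implies write, which implies list, which implies read); the slide does not state this, and `parse_acl` and `allows` are hypothetical names.

```python
# Levels in the order the slide lists them, strongest first.
# ASSUMPTION: each level also grants every level after it.
LEVELS = ["admin", "write", "list", "read"]


def parse_acl(text):
    """Parse '<level> <DN>' lines into a dict mapping DN to level."""
    acl = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only: DNs may themselves contain spaces.
        level, _, dn = line.partition(" ")
        dn = dn.strip()
        if level not in LEVELS or not dn:
            raise ValueError(f"bad ACL line: {line!r}")
        acl[dn] = level
    return acl


def allows(acl, dn, wanted):
    """True if the DN holds the wanted level or a stronger one."""
    held = acl.get(dn)
    return held is not None and LEVELS.index(held) <= LEVELS.index(wanted)
```

A DN that does not appear in the ACL gets no access at all in this sketch; group and VO entries are left out, as on the slide.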
ACL implementation
• Each directory has, or appears to have, a read-only file .grid-acl consisting of ACL lines in <level> <DN/group> format.
• Can easily be transferred via existing protocols
– eg if cache daemon fetches a file from a remote gsi-ftp server, can fetch the .grid-acl from the same directory without modifying gsi_wuftpd or GridFTP protocol.
• Modification of ACL done by accessing “virtual files” - these operations are trapped by SlashGrid and ACL updated
– cf. Coda’s .CONTROL mechanism
– eg remove file .grid-acl-write-%%url-encoded-DN%% to change the DN’s permission level to write
• Provide command line tools to hide this from users
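The virtual-file naming can be sketched in Python as below. This assumes the `%%...%%` on the slide marks a placeholder rather than literal characters, and that the DN is percent-encoded with no characters left unescaped; the exact encoding SlashGrid uses is not given, and both function names are hypothetical.

```python
from urllib.parse import quote, unquote

PREFIX = ".grid-acl-"
LEVELS = ("admin", "write", "list", "read")


def virtual_name(level, dn):
    """Build the virtual file name that requests `level` for `dn`."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    # safe="" also escapes "/", so the DN fits in a single path component.
    return f"{PREFIX}{level}-{quote(dn, safe='')}"


def parse_virtual_name(name):
    """Return (level, dn) for a virtual ACL name, or None for an ordinary file."""
    if not name.startswith(PREFIX):
        return None
    level, sep, encoded = name[len(PREFIX):].partition("-")
    if not sep or level not in LEVELS or not encoded:
        return None
    return level, unquote(encoded)
```

A command line tool of the kind the slide mentions could call `virtual_name` and then perform the file operation, so users never type the encoded DN themselves.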
Plugin framework
• Avoid making a monolithic system since:
– Lots of interesting filesystems possible: anon ftp, http, https, gsi-ftp, rfio, ldap, SQL databases (cf. Oracle 8i) …
– Lots of uncertainty about which caching strategies to use.
– Some people will want some but not all of this on their systems.
• Have /etc/slashgrid.conf that specifies mount points and then which loadable module handles which part of the file system (cf. /etc/fstab)
• At start time, load dynamic modules which all export a common API.
• SlashGrid daemon hands each request to the right plugin
– user: stat() => coda_getattr => PluginStat() => plugin: stat()
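The hand-off from daemon to plugin can be illustrated with a small Python dispatcher. It assumes the plugin is chosen by longest matching mount-point prefix, which the slide does not specify. The `Plugin` and `Dispatcher` classes are hypothetical stand-ins for the loadable modules and their common API.

```python
class Plugin:
    """Hypothetical common API: every plugin offers the same operations."""

    def __init__(self, name):
        self.name = name

    def stat(self, relpath):
        # A real plugin would stat a local or remote file here.
        return (self.name, "stat", relpath)


class Dispatcher:
    """Routes each request to the plugin that owns that part of the tree."""

    def __init__(self):
        self._mounts = {}

    def register(self, prefix, plugin):
        self._mounts["/" + prefix.strip("/")] = plugin

    def resolve(self, path):
        """Return (plugin, path relative to its mount point)."""
        path = "/" + path.strip("/")
        best = None
        for prefix in self._mounts:
            matches = (prefix == "/" or path == prefix
                       or path.startswith(prefix + "/"))
            # ASSUMPTION: the longest matching prefix wins.
            if matches and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise LookupError(f"no plugin handles {path}")
        return self._mounts[best], path[len(best):].lstrip("/")

    def stat(self, path):
        plugin, rel = self.resolve(path)
        return plugin.stat(rel)
```

With "/" and "/gsiftp" registered, a stat of /gsiftp/host/file goes to the second plugin and everything else falls through to the first.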
Example configuration
/etc/slashgrid.conf
[/]
plugin=certfs.so
[/gsiftp]
plugin=gsiftpfs.so
/grid - mount point for Coda kernel module fs
/var/spool/slashgrid/fcache/ => /grid/
/var/spool/slashgrid/fcache/tmp/ => /grid/tmp/
/var/spool/slashgrid/fcache/gsiftp/ => /grid/gsiftp/
/usr/lib/slashgrid/plugins/certfs.so, gsiftpfs.so ...
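The example configuration has the shape of an INI file, with one section per mount point and a `plugin` key in each. A Python sketch that reads it with the standard `configparser` module follows; treating the file as INI is an assumption, and SlashGrid's own parser may accept a different syntax. `load_mounts` is a hypothetical name.

```python
import configparser

# The configuration from this slide, one section per mount point.
EXAMPLE = """\
[/]
plugin=certfs.so
[/gsiftp]
plugin=gsiftpfs.so
"""


def load_mounts(text):
    """Return a dict mapping each mount point to its plugin file name."""
    # interpolation=None keeps any literal "%" in a value untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: parser[section]["plugin"]
            for section in parser.sections()}
```

The resulting mapping is what a daemon would walk at start time to load each module from the plugin directory.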
Remote file access strategies
• SlashGrid framework allows several options: none is “the best”
• simplest: make a local copy when the coda_open call is received, and return the copy’s inode when transfer finishes
– ok for small files
– awful for very big files: need lots of disk cache and have to wait
• pure streaming: plugin forks a process to stream the file from the remote server; makes a temporary named pipe and returns its inode to the kernel; writes the incoming file to the pipe; kernel (and therefore user) reads the file as it comes in; tidies up the pipe when coda_close is received.
– good when we have a copy on a “close” file server (cf. NFS)
• both: stream file down a named pipe, but keep a copy too.
• Writing even more complicated: when to transfer local write-cache?
– do we need consistency for different machines viewing the same server?
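The “both” strategy (stream down a named pipe, but keep a copy too) can be sketched in Python as follows. The sketch uses a thread where the slide describes a forked process, takes the incoming data as an iterable of byte chunks instead of a real network transfer, and needs a POSIX system for `os.mkfifo`. The function names are hypothetical.

```python
import os
import threading


def stream_and_cache(chunks, fifo_path, cache_path):
    """Write each incoming chunk to the named pipe and to a cache file.

    Opening the pipe for writing blocks until a reader has opened it.
    """
    with open(cache_path, "wb") as cache, open(fifo_path, "wb") as fifo:
        for chunk in chunks:
            fifo.write(chunk)
            fifo.flush()        # let the reader see data as it arrives
            cache.write(chunk)  # keep a local copy for later opens


def start_transfer(chunks, fifo_path, cache_path):
    """Create the pipe and start the background transfer."""
    os.mkfifo(fifo_path)
    worker = threading.Thread(
        target=stream_and_cache, args=(chunks, fifo_path, cache_path))
    worker.start()
    return worker
```

The caller reads from the pipe as data arrives, joins the worker, and then removes the pipe, which corresponds to the tidy-up on coda_close. A reader that closes early would raise BrokenPipeError in the worker, and the sketch does not handle that case.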
Current status
• Have implemented SlashGrid daemon and one plugin to provide local file storage with ACLs (certfs.so)
• SlashGrid obtains DN of a UID from /tmp/x509up_uUID
– so you do grid-proxy-init to get started
• stat / read / creat / mkdir / write / remove / rename / chmod system calls working for files and directories
• can already do normal shell commands (ls etc), edit files with emacs, even copy the SlashGrid and certfs sources into the filesystem and build them with make and gcc.
• some things not yet done
– hard and soft links (means I can’t try building a Linux kernel yet…)
– modifying ACLs - have to be set manually as root still
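The UID-to-proxy lookup mentioned above comes down to a file naming rule, sketched here in Python. Only the path construction is shown. Extracting the DN from the proxy certificate needs an X.509 parser and is left out. `proxy_path` is a hypothetical name.

```python
import os


def proxy_path(uid=None):
    """Return the proxy file path for a UID, following /tmp/x509up_uUID."""
    if uid is None:
        uid = os.getuid()  # default to the calling process's own UID
    return f"/tmp/x509up_u{uid}"
```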
Future work
• Finish certfs and ACL tools
• Implement an example remote IO plugin
– probably anonymous ftp since simplest
• Document the plugin API
– Encourage other people to write plugins for things they need.
• Write plugins for the major protocols: gsi-ftp and https
• Investigate specialised filesystems for dynamic accounts, automated cleanup, extra logging / auditing, ...
• Look at porting to other OS’s:
– Coda kernel module exists for *BSD and Windows already
– The Linux Coda module was only 4000 lines of C...
Conclusion
• Have implemented a read/write filesystem for Linux, based on Grid DNs rather than Unix UIDs.
• Have done this in an extendable way using plugins for different filesystem types.
• Should be straightforward to write a plugin for your favourite remote file access protocol.
• System is efficient for streaming local copies of files
– But can still accommodate many different strategies for fetching, caching and streaming files from remote servers.
• (Thanks to Anders, Cal and Fabio of Integration Team for useful discussions about all these issues.)
More information...
• [email protected]
– (now)
• http://www.gridpp.ac.uk/slashgrid/
– (later today)
• WP6 CVS repository
– (later this week)