Andrew McNab - Manchester HEP - 29 January 2002 SlashGrid (“/grid”) Motivation: dynamic-accounts...


Andrew McNab - Manchester HEP - 29 January 2002

SlashGrid (“/grid”)

• Motivation: dynamic-accounts issues
• Local storage: implementation alternatives
• Generalisation: remote file access
• Implementation: Coda, ACLs, plugins
• Current status
• Future work

a framework for Grid-aware filesystems

Motivation: dynamic accounts

• For TB1 we provided a patch for the Globus gatekeeper, gsi_wuftpd etc. to associate Unix UIDs from a pool with the Grid DN identities of incoming requests.

• This is OK when all that jobs do on the machine in question is computation.

• But (1) any files created by a pool UID need to be cleaned up before the account can be reallocated.

• But (2) this is no good for long-term storage, since there is no promise to maintain the UID-DN association in the long term.

• But (3) what if a malicious user creates a cron entry, writes to some obscure writeable directory we didn’t think of, etc?
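
The reallocation problem can be seen in a toy model of a UID pool. This is only a sketch: the class name, the UID value and the DNs are made up for illustration, and the real gatekeeper patch is not shown in these slides.

```python
class AccountPool:
    """Toy model of a pool of Unix UIDs leased to Grid DNs (hypothetical)."""

    def __init__(self, uids):
        self.free = list(uids)   # UIDs not currently leased
        self.leases = {}         # DN -> UID, meaningful only while leased

    def acquire(self, dn):
        """Return the UID leased to this DN, leasing a free one if needed."""
        if dn not in self.leases:
            if not self.free:
                raise RuntimeError("account pool exhausted")
            self.leases[dn] = self.free.pop(0)
        return self.leases[dn]

    def release(self, dn):
        """End the lease; the UID may now be handed to a different DN."""
        self.free.append(self.leases.pop(dn))


pool = AccountPool([5001])                  # made-up pool with a single UID
alice = "/O=Grid/OU=Example/CN=Alice"       # made-up DNs
bob = "/O=Grid/OU=Example/CN=Bob"

uid_alice = pool.acquire(alice)
pool.release(alice)
uid_bob = pool.acquire(bob)

# Both users were given the same UID at different times, so any file that
# Alice left behind, owned by that UID, would now appear to belong to Bob.
same_uid = uid_alice == uid_bob
```

Hence the need to clean up before reallocation, and the lack of any long-term meaning for the UID.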

Solution: get away from UID filesystems

• All these problems are fundamentally because files are owned according to UID, but we want UID to have no long term meaning.

• Obvious solution is to have a filesystem where file ownership depends on Grid DNs, not temporary UIDs.
– Can then ban user processes from writing anywhere else (straightforward to impose this with a modified ext2 device driver: e.g. no disk files can be created if UID > 99)

• UID becomes as transitory as Process Group ID.

• Problem now becomes: how to implement a DN/Grid aware filesystem?

Implementation alternatives

• 1) Fake a filesystem by making user processes use modified versions of the open(), read() etc. system calls.
– Can do this by relinking, or by an interposition / bypass library that is preloaded before the real, shared libc.
– But this cannot enforce access restrictions on files accessible on local disk (since you can use a static binary and ignore permissions)
– Need to put the filesystem behind a server, accessed via TCP ports, named pipe, or shared memory (all the usual X tricks). This is going to be slow for streaming large files: the very thing we need to be fast.

• 2) Put filesystem into kernel
– Lets kernel enforce access control. Potentially as fast as normal disk.
– User space daemon useful to parse proxies, and do any remote IO.

Coda

• A suitable kernel module already exists for Linux: Coda
– introduced into main kernel tree in 1997 (during 2.1) and present in all 2.2 and 2.4 stable kernels.

• This is part of the Coda project at CMU, an open source fork of AFS2.

• Very similar architecture to AFS
– Kernel module and client side cache daemon (Venus)
– Kerberos based

• Already used “parasitically” by other Linux projects
– e.g. AVFS maps files to virtual filesystems (e.g. cd into a tar file…)

• Coda kernel module / Venus also available for *BSD and Windows 98/NT upwards.

Implementation with Coda

• Coda kernel module talks to client cache daemon by exchanging messages via /dev/cfs0

• Since we already have the kernel module, we just need to write a Venus-like daemon: SlashGrid (“/grid”)

• Coda implementation allows efficient streaming:
– open(), close(), stat() handled by calls to Venus/SlashGrid daemon
– coda_open call returns the inode of the cached copy to the kernel
– subsequent read() and write() operations handled by kernel itself, without daemon being involved.
– So streaming a local copy is just as fast as reading/writing a normal disk file.

• Since SlashGrid is called for open() etc., it can enforce DN-based access control at that point.
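
The division of labour above can be imitated in user space. The following is a toy model only, not the Coda protocol: ToyCacheDaemon, handle_open and the DN are invented names, and an ordinary Python file object stands in for the kernel reading the cached copy.

```python
import os
import tempfile


class ToyCacheDaemon:
    """Stands in for the SlashGrid daemon; only open-type requests reach it."""

    def __init__(self, cache_dir, allowed):
        self.cache_dir = cache_dir
        self.allowed = allowed   # set of (dn, path) pairs that may be opened
        self.upcalls = 0         # how many times the "kernel" asked the daemon

    def handle_open(self, dn, path):
        """Check access by DN, then hand back the cached copy's location."""
        self.upcalls += 1
        if (dn, path) not in self.allowed:
            raise PermissionError(path)
        return os.path.join(self.cache_dir, path.lstrip("/"))


cache_dir = tempfile.mkdtemp()
with open(os.path.join(cache_dir, "data.txt"), "w") as f:
    f.write("x" * 10000)

dn = "/O=Grid/OU=Example/CN=Alice"   # made-up DN
daemon = ToyCacheDaemon(cache_dir, {(dn, "/data.txt")})

# One upcall for the open; every read below goes straight to the cached
# file and never involves the daemon again.
cached = daemon.handle_open(dn, "/data.txt")
total = 0
with open(cached) as f:
    while True:
        chunk = f.read(1024)
        if not chunk:
            break
        total += len(chunk)
```

However many read() calls the loop makes, the daemon's counter stays at one, which is the property that makes streaming cheap.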

System calls with SlashGrid

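[Diagram: two user processes above the kernel. Standard Unix: open(), read() and stat() on an ordinary directory go through the kernel to a real (ext2) disk. SlashGrid: open() and stat() on /grid/... are passed via /dev/cfs0 to SlashGrid; read() and write() are handled by the kernel using /var/spool/slashgrid/fcache.]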

Remote file access

• Another idea that has been around a while: AFS-like system using Grid protocols.

• All the usual advantages of a global filesystem
– Makes a lot of the tedious management of “parameter” files needed by jobs just another operating system service.

– Very useful for interactive users: they just see the Grid as one big file system.

– Makes all applications (even ls) Grid-enabled immediately.

• Already using URLs to refer to remote files, so easy to find an appropriate mapping into a filesystem space.

• So we want to design a system that can be generalised to remote file access too.

ACL format

• Need to specify permissions in some way.

• Commonly used compromise between granularity and simplicity is the per-directory ACL (cf. AFS)

• We’ve used the same format as the GridSite website management system (used for WP6 and GridPP websites):
– admin: can modify ACL
– write: can write/create files
– list: can get a directory listing
– read: can read a named file
– ACL consists of lines: <level> <DN/group>

• Currently only implement <DN> but in future will add VO groups, CAS authorisation symbols etc. (when dust settles...)
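
A minimal sketch of reading ACL lines in the <level> <DN> format and checking a request against them. Two things here are assumptions rather than anything stated above: that the levels are cumulative in the order read < list < write < admin, and the function names parse_acl and allows. The DNs are made up.

```python
LEVELS = ["read", "list", "write", "admin"]   # assumed order, weakest first


def parse_acl(text):
    """Parse '<level> <DN>' lines into a dict mapping DN -> level."""
    acl = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        level, dn = line.split(None, 1)   # the DN itself may contain spaces
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        acl[dn] = level
    return acl


def allows(acl, dn, needed):
    """True if dn's level is at least `needed` (assumes cumulative levels)."""
    granted = acl.get(dn)
    return granted is not None and LEVELS.index(granted) >= LEVELS.index(needed)


acl = parse_acl("""
admin /O=Grid/OU=Example/CN=Alice Admin
read /O=Grid/OU=Example/CN=Bob Reader
""")

alice_can_write = allows(acl, "/O=Grid/OU=Example/CN=Alice Admin", "write")
bob_can_list = allows(acl, "/O=Grid/OU=Example/CN=Bob Reader", "list")
```

A DN that does not appear in the ACL gets no access at all in this sketch.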

ACL implementation

• Each directory has, or appears to have, a read-only file .grid-acl consisting of ACL lines in <level> <DN/group> format.

• Can easily be transferred via existing protocols
– e.g. if the cache daemon fetches a file from a remote gsi-ftp server, it can fetch the .grid-acl from the same directory without modifying gsi_wuftpd or the GridFTP protocol.

• Modification of ACL done by accessing “virtual files” - these operations are trapped by SlashGrid and the ACL updated
– cf. Coda’s .CONTROL mechanism
– e.g. remove file .grid-acl-write-%%url-encoded-DN%% to change the DN’s permission level to write

• Provide command line tools to hide this from users
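
A sketch of how such virtual file names might be built and recognised. The exact encoding SlashGrid uses is not given here, so percent-encoding every reserved character with urllib.parse.quote is an assumption, as are the two helper names.

```python
from urllib.parse import quote, unquote

PREFIX = ".grid-acl-"


def acl_virtual_name(level, dn):
    """Build the virtual file name that requests `level` for `dn`."""
    # safe="" also encodes "/", so the DN cannot be mistaken for a path.
    return f"{PREFIX}{level}-{quote(dn, safe='')}"


def parse_acl_virtual_name(name):
    """Return (level, dn) for a virtual ACL file name, or None otherwise."""
    if not name.startswith(PREFIX):
        return None
    level, sep, encoded = name[len(PREFIX):].partition("-")
    if not sep:
        return None
    return level, unquote(encoded)


name = acl_virtual_name("write", "/O=Grid/CN=Alice Smith")   # made-up DN
decoded = parse_acl_virtual_name(name)
```

The plain .grid-acl file itself does not match the prefix, so a trap based on this check would leave ordinary reads of the ACL alone.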

Plugin framework

• Avoid making a monolithic system since:
– Lots of interesting filesystems possible: anon ftp, http, https, gsi-ftp, rfio, ldap, SQL databases (cf. Oracle 8i) …
– Lots of uncertainty about which caching strategies to use.
– Some people will want some but not all of this on their systems.

• Have /etc/slashgrid.conf that specifies mount points and then which loadable module handles which part of the file system (cf. /etc/fstab)

• At start time, load dynamic modules which all export a common API.

• SlashGrid daemon hands each request to the right plugin
– user: stat() => coda_getattr => PluginStat() => plugin: stat()
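
One plausible way to route a request to a plugin is longest-prefix matching on the mount point. This is a sketch only: the Plugin class, its stat method and find_plugin are hypothetical stand-ins, since the real plugin API is not shown here, and the real daemon loads shared objects rather than Python classes.

```python
class Plugin:
    """Hypothetical common interface that every plugin exports."""

    def __init__(self, name):
        self.name = name

    def stat(self, relpath):
        # A real plugin would return file attributes; this toy one just
        # reports which plugin handled the request.
        return f"{self.name} stat {relpath}"


def find_plugin(mounts, path):
    """Pick the plugin whose mount point is the longest prefix of path."""
    best = None
    for mount in mounts:
        prefix = mount.rstrip("/")   # "/" becomes "", which matches any path
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, mount)
    if best is None:
        raise LookupError(path)
    prefix, mount = best
    return mounts[mount], path[len(prefix):] or "/"


mounts = {"/": Plugin("certfs"), "/gsiftp": Plugin("gsiftpfs")}

plugin, relpath = find_plugin(mounts, "/gsiftp/host/file")
remote_result = plugin.stat(relpath)

plugin, relpath = find_plugin(mounts, "/tmp/notes")
local_result = plugin.stat(relpath)
```

Matching on whole path components keeps a name such as /gsiftpx from being routed to the /gsiftp plugin.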

Example configuration

/etc/slashgrid.conf

[/]
plugin=certfs.so
[/gsiftp]
plugin=gsiftpfs.so

/grid - mount point for Coda kernel module fs
/var/spool/slashgrid/fcache/ => /grid/
/var/spool/slashgrid/fcache/tmp/ => /grid/tmp/
/var/spool/slashgrid/fcache/gsiftp/ => /grid/gsiftp/
/usr/lib/slashgrid/plugins/certfs.so, gsiftpfs.so ...
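
The configuration above looks INI-like, so for illustration it can be read with Python's configparser. That SlashGrid parses it this way is an assumption, and load_plugins and cache_path are invented helper names; the paths are the ones listed above.

```python
import configparser

CONF_TEXT = """\
[/]
plugin=certfs.so
[/gsiftp]
plugin=gsiftpfs.so
"""

GRID_MOUNT = "/grid"
CACHE_ROOT = "/var/spool/slashgrid/fcache"


def load_plugins(text):
    """Return a dict mapping each mount point to its plugin file name."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: parser[section]["plugin"] for section in parser.sections()}


def cache_path(grid_path):
    """Translate a path under /grid into its location in the cache tree."""
    if grid_path != GRID_MOUNT and not grid_path.startswith(GRID_MOUNT + "/"):
        raise ValueError(f"not under {GRID_MOUNT}: {grid_path}")
    return CACHE_ROOT + grid_path[len(GRID_MOUNT):]


plugins = load_plugins(CONF_TEXT)
tmp_location = cache_path("/grid/tmp/job.log")   # made-up file name
```

Here each section name is a mount point below /grid and the plugin key names the module under /usr/lib/slashgrid/plugins/.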

Remote file access strategies

• SlashGrid framework allows several options: none is “the best”

• simplest: make a local copy when the coda_open call is received, and return the copy’s inode when the transfer finishes
– OK for small files
– awful for very big files: need lots of disk cache and have to wait

• pure streaming: plugin forks a process to stream the file from the remote server; makes a temporary named pipe and returns its inode to the kernel; writes the incoming file to the pipe; kernel (and therefore user) reads the file as it comes in; tidy up pipe when coda_close is received.
– good when we have a copy on a “close” file server (cf. NFS)

• both: stream file down a named pipe, but keep a copy too.

• Writing is even more complicated: when to transfer the local write-cache?
– do we need consistency for different machines viewing the same server?
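
The “pure streaming” and “both” options can be tried out with an ordinary named pipe. This sketch departs from the description above in two ways: a thread stands in for the forked transfer process, and a list of byte chunks stands in for the remote server. stream_through_fifo is an invented name, and os.mkfifo makes this Unix-only.

```python
import os
import tempfile
import threading


def stream_through_fifo(chunks, keep_copy_path=None):
    """Feed chunks into a temporary named pipe; return what the reader sees."""
    workdir = tempfile.mkdtemp()
    fifo = os.path.join(workdir, "stream")
    os.mkfifo(fifo)

    def producer():
        # Stands in for the transfer process writing the incoming file.
        copy = open(keep_copy_path, "wb") if keep_copy_path else None
        try:
            with open(fifo, "wb") as pipe:
                for chunk in chunks:
                    pipe.write(chunk)
                    if copy:
                        copy.write(chunk)   # the "both" strategy
        finally:
            if copy:
                copy.close()

    writer = threading.Thread(target=producer)
    writer.start()
    try:
        with open(fifo, "rb") as pipe:      # the reading "user process" side
            data = pipe.read()
    finally:
        writer.join()
        os.remove(fifo)                     # tidy up the pipe when finished
        os.rmdir(workdir)
    return data


copy_path = os.path.join(tempfile.mkdtemp(), "copy.bin")
streamed = stream_through_fifo([b"abc", b"def"], keep_copy_path=copy_path)
with open(copy_path, "rb") as f:
    kept = f.read()
```

The reader starts receiving data before the transfer finishes, and no disk cache is needed unless a copy is asked for.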

Current status

• Have implemented SlashGrid daemon and one plugin to provide local file storage with ACLs (certfs.so)

• SlashGrid obtains DN of a UID from /tmp/x509up_uUID
– so you do grid-proxy-init to get started

• stat / read / creat / mkdir / write / remove / rename / chmod system calls working for files and directories

• can already do normal shell commands (ls etc), edit files with emacs, even copy the SlashGrid and certfs sources into the filesystem and build them with make and gcc.

• some things not yet done
– hard and soft links (means I can’t try building a Linux kernel yet…)
– modifying ACLs - have to be set manually as root still

Future work

• Finish certfs and ACL tools

• Implement an example remote IO plugin
– probably anonymous ftp since simplest

• Document the plugin API
– Encourage other people to write plugins for things they need.

• Write plugins for the major protocols: gsi-ftp and https

• Investigate specialised filesystems for dynamic accounts, automated cleanup, extra logging / auditing, ...

• Look at porting to other OSs:
– Coda kernel module exists for *BSD and Windows already
– The Linux Coda module was only 4000 lines of C...

Conclusion

• Have implemented a read/write filesystem for Linux, based on Grid DNs rather than Unix UIDs.

• Have done this in an extendable way using plugins for different filesystem types.

• Should be straightforward to write a plugin for your favourite remote file access protocol.

• System is efficient for streaming local copies of files
– But can still accommodate many different strategies for fetching, caching and streaming files from remote servers.

• (Thanks to Anders, Cal and Fabio of Integration Team for useful discussions about all these issues.)

More information...

• [email protected]
– (now)

• http://www.gridpp.ac.uk/slashgrid/
– (later today)

• WP6 CVS repository
– (later this week)