AFS case study
manfred-furuholmen
AFS Case Study
Manfred at zeropiu.it
EuroBSDCon, November 2006
Agenda
• Overview
  • Basic Concepts
  • AFS Server Types
  • Arla Overview
• Best practice
  • Planning
  • AFS Convention
• Case Study
  • Solution
  • Architecture
  • Result
Overview
Andrew File System is a distributed file system designed to:
• handle terabytes of data
• handle thousands of users
• work in a WAN environment
Brief history of AFS
• 1983 Andrew Project started at Carnegie Mellon University (CMU)
• 1987 Coda research work begun (based on AFS)
• 1988 First use of AFS version 3; first use of AFS outside Carnegie Mellon University
• 1988 Institutional File System project at University of Michigan
• 1989 Transarc Corporation founded to commercialize AFS
• 1993 Arla project started at Kungliga Tekniska Högskolan
• 1998 Transarc Corporation becomes a wholly owned subsidiary of IBM
• 2000 IBM releases OpenAFS as open source (IBM License)
• 2000 OpenAFS releases version 1.0, based on Transarc 3.6
• 2001 OpenAFS releases version 1.2, the first release with better support for new operating systems and fixes for several memory leaks
• 2005 OpenAFS releases version 1.4 with many new features
• 2005 AFS was discontinued by IBM
Basic Concepts
• Transparent Access and Uniform Namespace
  • Cell
  • Partitions and Volumes
  • Mount Points
• Scalability
  • Client Caching
  • Replication
• Security
  • Authentication and secure communication
  • Authorization and flexible access control
• System Management
  • Single system interface
  • Delegation
  • Backup
Transparent Access and Uniform Namespace
• Cell
  • A cell is a collection of file servers and workstations
  • The directories under /afs are cells, forming a single unique tree
  • Fileservers contain volumes
• Volumes
  • Volumes are "containers" or sets of related files and directories
  • Have a size limit
  • 3 types: rw, ro, backup
• Mount Point
  • Access to a volume is provided through a mount point
  • A mount point looks and acts just like a static directory
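
As a rough illustration of how mount points tie volumes into one namespace, the sketch below maps a path to the volume mounted at its longest matching prefix. The cell name `example.com`, the mount table and the function name are hypothetical; a real AFS client resolves mount points while walking the directory tree rather than from a static table.

```python
# Toy mount table: directory path -> volume mounted there (all names hypothetical).
MOUNT_POINTS = {
    "/afs/example.com": "root.cell",
    "/afs/example.com/usr/m/manfred": "user.manfred",
    "/afs/example.com/apps": "apps",
}


def volume_for(path: str) -> str:
    """Return the volume mounted at the longest prefix of `path`."""
    parts = path.rstrip("/").split("/")
    # Walk from the full path up towards the root and stop at the first mount point.
    for end in range(len(parts), 0, -1):
        candidate = "/".join(parts[:end])
        if candidate in MOUNT_POINTS:
            return MOUNT_POINTS[candidate]
    raise LookupError(f"no volume mounted above {path}")
```

For example, `volume_for("/afs/example.com/usr/m/manfred/notes.txt")` yields the user's own volume, while a path directly under the cell root falls back to `root.cell`.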
Scalability
• Cache Manager (client side)
  • Maintains information about user identities
  • Retrieves data from the fileserver
  • Keeps chunks of retrieved files on local disk (cache)
• Replication
  • Frequently accessed data can be replicated (read-only) on several servers
  • The Cache Manager makes use of replicated volumes first
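
The "replicas first" rule can be sketched as a small selection function. This is a simplified model under stated assumptions: the `VolumeSites` record, server names and `pick_server` function are hypothetical, and a real Cache Manager also ranks replicas and considers how the path was reached.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VolumeSites:
    """Hypothetical record of the servers that hold one volume."""
    name: str
    rw_server: str
    ro_servers: List[str] = field(default_factory=list)


def pick_server(volume: VolumeSites, want_write: bool = False) -> str:
    """Prefer a read-only replica for reads; writes go to the read/write site."""
    if want_write or not volume.ro_servers:
        return volume.rw_server
    # A real client would rank the replicas; this sketch simply takes the first.
    return volume.ro_servers[0]
```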
Security
• Authentication
  • Kerberos IV native (kaserver)
  • External Kerberos V
  • Unique identity
  • Encrypted communication on data transfer (crypt option)
• Authorization
  • Access control lists with 7 types of permission
  • Groups defined by users
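
The seven permissions are the AFS directory rights r, l, i, d, w, k and a. The sketch below expands a rights string (or one of the shorthand words `read`, `write`, `all`, `none`) and checks whether a user is granted a right. The helper names and the sample ACL are hypothetical, and negative ACL entries are not modelled.

```python
from typing import Dict, Set

# The seven AFS directory rights, keyed by their one-letter codes.
RIGHTS = {
    "r": "read", "l": "lookup", "i": "insert", "d": "delete",
    "w": "write", "k": "lock", "a": "administer",
}

# Shorthand words that stand for common combinations of rights.
SHORTHAND = {"read": "rl", "write": "rlidwk", "all": "rlidwka", "none": ""}


def expand_rights(spec: str) -> Set[str]:
    """Turn 'rl' or a shorthand word such as 'write' into a set of right letters."""
    letters = SHORTHAND.get(spec, spec)
    unknown = set(letters) - set(RIGHTS)
    if unknown:
        raise ValueError(f"unknown rights: {sorted(unknown)}")
    return set(letters)


def is_allowed(acl: Dict[str, str], identities: Set[str], right: str) -> bool:
    """True if any ACL entry for the user or one of their groups grants `right`."""
    return any(right in expand_rights(spec)
               for who, spec in acl.items() if who in identities)
```

Here `identities` is the user's name together with the groups they belong to, so `is_allowed(acl, {"anna", "staff"}, "r")` checks both the personal and the group entries.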
System Management
• Single system interface
  • Configuration changes can be made from any client
  • Volumes can be moved transparently
  • Online upgrade and extension of the system
• Delegation
  • Group delegation
  • Admin delegation
• Backup
  • Backup of volumes and files
  • Built-in backup function
  • Direct user access (backup volume mounted)
Example write operation, client side
1 Create file RPC
2 Write chunks into cache (interrupted by store_data RPC)
3 Read from cache
4 Transfer over network
5 Write to /vicepXX
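
The five client-side steps can be mimicked with a toy model in which writes accumulate in a local cache and reach the server only on close. Both classes are hypothetical stand-ins, not OpenAFS APIs, and the sketch ignores the case where a store_data RPC interrupts a long write before the file is closed.

```python
class FakeFileServer:
    """Stand-in for a fileserver; stores whole files keyed by path."""

    def __init__(self):
        self.files = {}

    def create_file(self, path):
        self.files[path] = b""

    def store_data(self, path, data):
        # Steps 4-5: data arrives over the network and lands in server storage.
        self.files[path] = data


class CachedFile:
    """Client-side handle that buffers writes locally until close()."""

    def __init__(self, server, path):
        self.server = server
        self.path = path
        self.cache_chunks = []
        server.create_file(path)          # step 1: create-file RPC

    def write(self, data: bytes):
        self.cache_chunks.append(data)    # step 2: chunks go into the local cache

    def close(self):
        payload = b"".join(self.cache_chunks)        # step 3: read back from cache
        self.server.store_data(self.path, payload)   # steps 4-5: ship to the server
```

Until `close()` is called, another client asking the server for the file would still see the old (here empty) contents.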
Example write operation, server side
1 Create file
2 Check metadata, permission, quota and return file path
3 Write file into /vicepXX
4 Update metadata on server
5 Update db
AFS Server Types
• Fileserver machine
  • File storage
• Database server machine
  • File and volume location
  • ACL and group administration
  • Authentication provider
• Binary distribution
  • Master server for AFS binaries (specific architecture)
• System control machine
  • Time server
  • AFS configuration master
AFS Server Processes
• Bosserver, system monitor
• Fileserver, serves files
• Volserver, serves volume data
• Vlserver, volume location server
• Kaserver, Kerberos IV server
• Ptserver, protection server (groups, ACLs)
• Buserver, backup server
• Upserver
  • Updates configuration
  • Updates binaries
Weakness
• File restrictions
  • Pipes
  • Device files
  • Sockets
  • Unicode names
• AFS locks
  • Only advisory locks (byte-range locking underway)
• ACL
  • Only on directories
• Volume
  • Read-only
  • Manual sync
• Write on close
• Date time operation
Arla
• AFS client alternative
• *BSD support
• Disconnected operation
Where used?
• Universities
  • CMU, Stanford, MIT, KTH (Sweden), Chemnitz (Germany), Roma3 (Italy), …
• Research labs
  • SLAC, DESY, CERN (Europe), INFN (Italy), …
• Companies
  • Intel, Morgan Stanley, Pictage, ..
Conventions and Best Practices
• AFS file space layout
• Server planning
• Volume naming and schemas
• Volume replication
• Username schemas
• Partition filesystem
• Backup planning
• Security considerations
• Client cache tuning
• AFS limitations
Cell Name
• Convention
  • Company domain name
  • Company Kerberos realm
• Cell name
  • Short name (the maximum cell name length is 64 characters)
  • Cell name can contain only lowercase characters
  • Suitable for different operating systems (do not include command shell metacharacters)
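
The cell-name rules above lend themselves to a quick check before a cell is created. The sketch below is only an approximation: the exact allowed character set (lowercase letters, digits, dots and hyphens, as in a DNS domain name) is an assumption, and the function name is hypothetical.

```python
import re

MAX_CELL_NAME = 64

# Assumption: a cell name looks like a lowercase DNS domain name, which also
# rules out shell metacharacters such as ';', '*' or spaces.
_CELL_RE = re.compile(r"[a-z0-9]([a-z0-9.-]*[a-z0-9])?")


def is_valid_cell_name(name: str) -> bool:
    """Check length, lowercase-only and absence of shell metacharacters."""
    return len(name) <= MAX_CELL_NAME and _CELL_RE.fullmatch(name) is not None
```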
Server planning
• Fileserver
  • Client-to-server ratio of 200:1 (many sites today have 1000:1)
  • Replica server location
  • Big machine vs small machine
• Database server
  • 3 machines for the election algorithm (Ubik)
  • Separate from the fileserver
• Update server
  • One system
• Binary distribution
  • One system per architecture
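
The planning ratios translate into simple arithmetic. The sketch below sizes the fileserver pool from a client count and computes how many database servers must agree in a simple-majority election. Both function names are hypothetical, and real Ubik elections have additional rules that are not modelled here.

```python
import math


def fileservers_needed(clients: int, clients_per_server: int = 200) -> int:
    """Number of fileservers for a given client count and client:server ratio."""
    return max(1, math.ceil(clients / clients_per_server))


def quorum(db_servers: int) -> int:
    """Simple majority of database servers (a simplification of Ubik)."""
    return db_servers // 2 + 1
```

With the 550 PCs of the case study later in this deck (400 Windows plus 150 Linux), the 200:1 ratio gives `fileservers_needed(550)`, i.e. 3 machines, and a 3-node database tier keeps working as long as `quorum(3)`, i.e. 2 nodes, are reachable.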
Volume naming and schemas
• Volume name limit
  • Read/write volume names can be up to 22 characters in length
  • The .readonly and .backup extensions are reserved
  • The root.afs and root.cell names are used by default
• Volume naming
  • Mount point prefix name (user.manfred)
  • Function suffix name
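
A minimal sketch of the naming limits above, assuming only the two rules stated on the slide (22-character limit and the reserved `.readonly` / `.backup` extensions). The function names are hypothetical.

```python
MAX_RW_NAME = 22
RESERVED_SUFFIXES = (".readonly", ".backup")


def is_valid_rw_volume_name(name: str) -> bool:
    """Apply the length limit and reject the reserved extensions."""
    if not name or len(name) > MAX_RW_NAME:
        return False
    return not name.endswith(RESERVED_SUFFIXES)


def user_volume(username: str) -> str:
    """Build a 'user.<name>' volume name and validate it."""
    name = f"user.{username}"
    if not is_valid_rw_volume_name(name):
        raise ValueError(f"invalid volume name: {name!r}")
    return name
```

Because the `user.` prefix already uses five characters, this convention leaves 17 characters for the username itself.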
Volume layout and replication
• Volume
  • Each user has their own volume, to simplify load-balancing operations (move, backup)
  • Volumes for groups of files (binaries, documents, ..)
• Replication is not appropriate for volumes that change frequently
• Replicate root.afs and root.cell as widely as possible
• A backup volume uses the same partition (it is a copy of the source volume's vnode index)
Username
• Username
  • Avoid characters that have special meanings to the command shell
  • Avoid the colon (:), because AFS reserves it as a field separator in protection group names
  • Avoid the period (.) in ordinary usernames; it is conventionally used to identify special usernames that have administrator capability (e.g. manfred.admin)
  • AFS UID 32766 is reserved for the user anonymous
• UID matching, Unix UID / AFS UID
  • Unix LDAP
  • NIS
  • Kerberos LDAP backend
  • SMB
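
The username conventions can be turned into a small lint-style check. This is a sketch under stated assumptions: the set of shell metacharacters is an illustrative choice rather than an official list, and the function name and problem labels are hypothetical.

```python
from typing import List

# Assumption: an illustrative set of characters that common shells treat specially.
SHELL_METACHARACTERS = set("!\"#$&'()*;<>?[\\]^`{|}~ \t")

# UID reserved for the user anonymous, as stated on the slide.
ANONYMOUS_UID = 32766


def username_problems(name: str) -> List[str]:
    """Return labels for every convention the username violates."""
    problems = []
    if any(c in SHELL_METACHARACTERS for c in name):
        problems.append("shell-metacharacter")
    if ":" in name:
        problems.append("colon")    # reserved as separator in group names
    if "." in name:
        problems.append("period")   # conventionally marks admin identities
    return problems
```

A name such as `manfred.admin` is reported with the `period` label, which is a reminder of the convention rather than a hard error.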
Partition Filesystem (inode vs namei)
Inode (faster)
• Dedicated partition
• Special fsck for the system partition
• No journaling file system
• Restore must use the same filesystem layout (same inode structure)
Namei (slower)
• OS fsck
• Filesystem independent, with the advantage of journaling
• There are no special requirements for /vicepXX; it can be a mounted filesystem
• Simple restore operation
Backup
• Native backup system and recovery
  AFS can be configured to create a full or incremental backup
• Volume dump
  This operation creates a binary file with all the information of the backup volume
• Backup systems with AFS support
  • Amanda
  • Bacula
  • Other commercial products
Security consideration
• User accounts
  • Kerberos integration with a modified login utility
  • Replace kaserver with a Unix Kerberos solution or Windows AD (OpenAFS supports the basic Kerberos 5 2b protocol)
  • Include the unlog command in every user's .logout file or equivalent
• Server machines
  • Change the AFS server encryption key on a frequent and regular schedule
  • In particular, limit access to the local superuser root account on a server machine
• System administrators
  • Create an administrative account for each administrator, separate from the personal account
  • Assign AFS privileges only to the administrative account
  • Set the token lifetime for administrative accounts to a fairly short amount of time
Client Cache
• Cache size
  • Single-user machine: 128MB
  • Multi-user machine: 1GB/4GB
• Cache partition
  • Directory: the partition must guarantee enough space
  • Disk partition: better performance (Terminal Server)
• Login integration
AFS limitations
General limits
• OpenAFS can support a maximum of 104,000 clients per server
• tmpfs does not work as AFS cache (a ramdisk works)
• Max 255 partitions per server (/vicepa-/vicepiv), no limit on partition size
• Max 4,294,967,295 volumes per partition (this is a limit of the VLDB)
• Max volume size is 2TB
• Max file limit per directory is 64,000 files (with names shorter than 16 characters)
Windows limits
• Write-on-close: changes are synchronized only on the close operation
• No integration with Microsoft DFS
• No support for files greater than 2GB on the Windows platform (work in progress)
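
The /vicepa-/vicepiv range follows from a letter-based numbering of partitions. The sketch below assumes that single-letter names a-z come first and are followed by two-letter names aa, ab, and so on; under that assumption the name /vicepiv falls at zero-based index 255. The function name is hypothetical.

```python
import string


def partition_name(index: int) -> str:
    """Map a zero-based partition index to a /vicepXX directory name."""
    letters = string.ascii_lowercase
    if index < 0:
        raise ValueError("index must be non-negative")
    if index < 26:
        suffix = letters[index]                   # /vicepa .. /vicepz
    else:
        first, second = divmod(index - 26, 26)    # /vicepaa, /vicepab, ...
        if first >= 26:
            raise ValueError("index too large for a two-letter suffix")
        suffix = letters[first] + letters[second]
    return "/vicep" + suffix
```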
Case Study
Italsempione is nowadays the biggest fully independent Italian forwarding company, covering every service related to transport and logistics with a worldwide agency network.
Company:
• Headquarters in Italy
• 16 branch offices in Italy
• 7 branches outside Italy
• 400 PCs, Windows XX
• 150 PCs, Linux
• 8 Windows NT domains
• Wide Area Network
• No IT staff at the branch offices
Solution
• Primary goals
  • Reduce the cost of software licenses
  • Simplify system administration tasks
• Solution
  • Thin client replacement, terminal server
  • Server virtualization, VMware
  • Storage virtualization, OpenAFS
Architecture
Architecture
Headquarters
3 fileserver machines
• User-to-server ratio 200:1
• The read-write information volumes are replicated with a circular schema
• The volumes of binaries and programs are replicated on all fileservers
• The fileservers are based on OpenBSD 3.9
3 database servers
• Installed on the same machines as the fileservers
2 authentication servers
• Heimdal Kerberos V
• LDAP backend (samba, heimdal, unix, profile info)
8 VM machines
• Windows terminal server image
• Linux terminal server image
• OpenBSD network service image
Architecture
Directory usage     Volume name
User home           user.username
User home backup    user.username.backup
Application         apps.applicationname
OS software         software.soname
Groups              groups.groupname
VMware image        image.osname
• Cell name = domain name
• Main directory tree = country/city/function
• User directory tree = usr/m/manfred
• User volume
• Volume name: prefix = mount point, suffix = function
• Volume replication
  • Binary data
  • Root volumes (afs, cell)
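
The naming scheme of this slide can be expressed as two small helpers, shown here as a sketch. The function names and the `kind` keys are hypothetical; the prefixes and the `usr/m/manfred` layout are taken from the slide.

```python
# Volume-name prefixes from the table on this slide, keyed by a hypothetical kind label.
VOLUME_PREFIX = {
    "home": "user",
    "application": "apps",
    "software": "software",
    "groups": "groups",
    "image": "image",
}


def volume_name(kind: str, name: str, backup: bool = False) -> str:
    """Build '<prefix>.<name>' and optionally append the .backup extension."""
    base = f"{VOLUME_PREFIX[kind]}.{name}"
    return base + ".backup" if backup else base


def home_mount_path(username: str) -> str:
    """Place a home directory under usr/<first letter>/<username>."""
    if not username:
        raise ValueError("username must not be empty")
    return f"usr/{username[0]}/{username}"
```

Spreading home directories over one-letter subdirectories keeps each directory small, which also helps to stay below the per-directory file limit.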
Architecture
• Partition
  • Inode based
  • Small partitions for quick check
  • Odd vicepX for rw volumes, even for ro volumes
• Backup
  • Bacula for incremental / total dump
  • User backup volume mounted in home dir
• Monitoring
  • Zabbix
  • AFS monitor and performance
Hardware
Fileserver / DbServer:
• 1GB of RAM
• 3GHz Xeon single processor
• 2x36GB SCSI RAID 1 for operating system partition
• 4x143GB SCSI RAID 5 storage (/vicepXX)
Authentication server:
• 1GB of RAM
• 3GHz Xeon single processor
• 2x36GB SCSI RAID 1 for operating system and db backend
VM machine:
• 4GB of RAM
• 3GHz Xeon dual processor
• 2x36GB SCSI RAID 1 for operating system and local VMware image
Why OpenBSD
• OpenAFS support
  • Port of server side and client side
• Security level
• Heimdal integration
  • AFS emulation
  • LDAP backend
  • 2ab protocol (large Kerberos ticket)
• Small and fast
• Stable
Consideration
• Iron server vs small server
  • A small number of inexpensive fileservers provides equivalent performance
  • Inexpensive incremental increase in capacity
  • Better manageability and redundancy
• NFS file sharing vs AFS
  • AFS resulted in a 60% decrease in network traffic
  • The server's load decreased by 80%
  • Task execution time was reduced by 30%
Benefit
• Reduced cost
  • Software costs reduced by 150,000 Euro
• Increased performance (server and desktop)
  • Reduced downtime
  • Reduced helpdesk load
• Simplified system administration tasks
  • Improved manageability
  • Full disaster recovery protection
  • Data accessible from Spain to Singapore with a
    • High security level
    • Single sign-on
Next
OpenAFS
• Lock subsystem
• Windows support
• Kerberos V support
External project (www.beolink.org)
• Ptserver with LDAP backend
• Web interface
The End