Sun QFS and Sun Storage Archive Manager (SAM) Release 5.0...
Harriet Coverston
Distinguished Engineer
Sun Microsystems, Inc.
June 2009
Sun QFS and Sun Storage Archive Manager (SAM)
Release 5.0 & Beyond
Sun QFS and Sun SAM
Page 2
Challenge: Cost Efficient Data Management and Fast Access to Large Volumes of Data
● Budgets remain flat while data growth is exploding
  > Increasing management costs
● Compliance requirements
● Users require timely access to information throughout its lifecycle
Page 3
Sun Storage Software
Advanced Data Management Software
• Sun QFS – Shared File System
  > High performance parallel SAN file system
  > Native Linux Clients
  > http://www.sun.com/storage/management_software/data_management/qfs
• Sun Storage Archive Manager (SAM)
  > Policy-based automatic data migration (local & remote)
  > Tiered Storage (supports both disk & tape)
  > http://www.sun.com/storage/management_software/data_management/sam
Page 4
Open Source Software
Build Community
• Source is open!
  http://opensolaris.org/os/project/samqfs
  http://blogs.sun.com/samqfs/
• SAM APIs are open!
  > Allow users to manage data in SAM-QFS from within an application program
    http://developers.sun.com/solaris/articles/libsam.html
• Discussion lists
  > Discussion list for general topics or issues
    http://mail.opensolaris.org/mailman/listinfo/sam-qfs-discuss
  > Development alias for specific questions about source, source contributions, or code reviews
    http://mail.opensolaris.org/mailman/listinfo/sam-qfs-dev
Page 5
Sun QFS - Shared File System
• Large, existing, and loyal customer base
  > Stable base, shipping since Aug 2002
• Targets large enterprises, Web, and HPC
  > Clients run on Solaris (SPARC, x64, & X86) & Linux
  > Metadata server runs on Solaris (SPARC & x64)
  > HA option with Solaris Cluster
• Optional WORM functionality for business compliance
• QFS Shared configuration supports 512 nodes
Page 6
QFS Shared File System Benefits
• Data consolidation with SAN file sharing
  > HBO – 5000 hours of programming to manage
  > “Provided the scalability to store and manage large files created by program-length video with the performance necessary to meet HBO's demanding throughput goals”
    http://www.sun.com/customers/storage/hbo.xml
• Performance and scalability
  > Near raw I/O performance for streaming I/O and transactional I/O
  > File system I/O performance scales linearly with the hardware
• Parallel processing w/ multi-node read/write access
• Built in automatic & continuous data protection w/SAM
Page 7
QFS Certified with Solaris Cluster
• Solaris Cluster HA failover support
  > Standalone QFS
  > HA-NFS over QFS
  > HA-SAM
• Solaris Cluster Advanced Edition for Oracle RAC relies on Shared QFS with Solaris Cluster for HA
  > Oracle certified on 9i, 10g, and 11g
• Oracle Billing and Revenue Management uses sQFS
  > “The best known transaction rate seen on the RAC-database can be achieved when the data files are on the QFS file system.”
    http://www.oracle.com/industries/communications/pdfs/oracle-sun-performance-benchmark-wp.pdf
Page 8
Sun Storage Archive Manager (SAM)
• Policy based archiving
  > Media can be disk, tape, or optical
  > Local and remote copies
  > Classification is path, owner, group, size, wildcard & access
• Media format is tar – open format
  > Small files are put into a tar container so data is streamed at device speeds out to the tape
• Keeps all data available, but not on high cost storage
  > Archives data across the tiers according to access patterns
• On-demand, transparent file retrieval
• Continuous data protection – no waiting until midnight
Page 9
SAM Customer Benefits
• Leverage existing hardware – defer high cost disk purchase
  > Add (cheaper) storage tiers transparently
• Provide timely protection and timely access to information throughout its lifecycle
• Meet compliance requirements with WORM support
• Eliminate backup window problem & complexity
  > Reduce operational costs
• Quick disaster recovery for business continuance
Page 10
SAM-QFS Data Consolidation
Integrated Data Management
[Diagram labels: Meta-Data Server "Earth" and Clients "Pluto" and "Venus" sharing the QFS file system Sharefs_1; Meta-Data Disk(s) and User-Data Disk(s) on a SAN (Fibre Channel or Ethernet Switch); UNIX Workgroup, NT Workgroup, and Mixed Workgroup connected over TCP/IP via NFS and SAMBA/CIFS; QFS and SAM on Solaris; callouts: Data Consolidation, Transparent Data Protection and Instant Recovery]
Page 11
SAM's Archives are OPEN!
• Media format is open, not proprietary – tar format
  > Files can be recovered with or without SAM – our media format is open, NOT proprietary
• Metadata about the data is on the archives
  > If file system metadata is lost, the archives can be recovered with a procedure we call the “Ultimate Disaster Recovery”
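Because the archive media format is tar, a plain tar reader should be enough to get a file back. The sketch below illustrates that idea with Python's standard tarfile module. The file names and contents are made up, the container is held in memory rather than on tape, and nothing here reproduces SAM's actual on-media layout or the “Ultimate Disaster Recovery” procedure.

```python
import io
import tarfile


def pack_small_files(files: dict[str, bytes]) -> bytes:
    """Bundle many small files into one tar container (held in memory here)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def recover_file(container: bytes, name: str) -> bytes:
    """Read one member back with a plain tar reader; no archive manager involved."""
    with tarfile.open(fileobj=io.BytesIO(container), mode="r") as tar:
        member = tar.extractfile(name)
        if member is None:
            raise KeyError(name)
        return member.read()


if __name__ == "__main__":
    # Hypothetical file names and contents, for illustration only.
    container = pack_small_files({"dira/file1": b"hello", "dirz/file87": b"world"})
    print(recover_file(container, "dirz/file87"))
```

Packing many small members into one container is also what lets a writer stream one large object to the device instead of many tiny ones.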
Page 12
SAM-QFS Management Simplified
Wizard Guided Set-up and Browser-Based User Interface
• Centralized browser-based configuration and management of SAM and QFS on multiple hosts
• Configuration Wizards for
  > Archive policy
  > File system creation
  > Adding tape libraries
• Archive media operations and reporting
• Support for multiple levels of privilege using roles
Page 13
Support for Monitoring SAM and QFS
• The monitoring console (shown here) lets admins quickly understand their SAM environment
  – Potential trouble spots are indicated by severity icons in the left hand panel.
• E-mail notifications can be configured to alert admins of problems with file systems, archiving and archive media
• System metrics provide archive media reports and file data distribution charts
• Faults provide a record of adverse conditions that have occurred in the system (including tape alerts)
Page 14
Infinite Archive System (IAS)
Core Features & Functions
> Complete archive solution in a rack
  > Software, Servers, Storage, Services
> Simple ordering process
> Works with existing tape libraries
> Automated data migration
> Continuous backup protection
Key Benefits
> Lower Cap-Ex
  > 20% - 33% savings*
> Lower Op-Ex
  > Simple administration
  > Easy installation
* over list price of separate components
[Diagram labels: Client, Client, Client; Filer, Filer, Filer; IAS with Policy and Archiving Services (SAM-QFS); SAN Fabric; FC, SATA, TAPE]
Sun Microsystems, confidential, internal use only
HPC Storage Solutions
[Diagram labels: Compute Cluster; IB Network; High Bandwidth Scalable Storage Cluster with Lustre (Metadata Servers, Object Storage Farm); Data Movers; SAN; Long-Term Data Retention with SAM-QFS (Home Directories, Tier 1 Archive, Near Line Archive); arrows labeled Load and Archive]
Page 16
5.0 Features
• Solaris 10 & OpenSolaris (SPARC/AMD/Intel)
  > Solaris 10 released April, 2009
  > OpenSolaris planned for 2009.06 OpenSolaris release
• Online grow and shrink (with & without SAM)
• Rolling upgrades for Shared QFS
• Directory Lookup Performance Improvements
• SAM performance improvements
  > Archiver, stager, samfsdump performance
Page 17
5.0 Features continued...
• Sideband MySQL database for fast SAM queries
• Solaris Feature Integration
  > Zones Support
  > Solaris Service Management Facility (SMF)
  > VPM virtual memory performance improvement for X64
• GUI Enhancements:
  > Usability study enhancements (including first time configuration checklist)
  > Online Grow/Shrink, Shared Client On/Off, WORM
Page 18
Online Grow
• Support on-line grow of file system by adding a new LUN – a meta, data, or stripe group LUN
  > samu command: add eq
  > After grow of standalone QFS, the LUN state is ON
  > After grow of Shared QFS, the LUN state is UNAVAIL
    – Solaris clients execute samd buildmcf followed by a samd conf (applications may continue to run)
    – Linux clients must umount, samd conf
    – SANergy clients must umount, fuse
    – After all mounted clients have updated mcfs, the LUN can be changed to ON with the samu command: alloc eq
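The Shared QFS grow sequence can be read as a simple gate: the new LUN stays UNAVAIL until every mounted client has picked up the new mcf, and only then is it switched to ON. The sketch below is a toy model of that rule. The class, method names, host names, and the "ABSENT" starting state are hypothetical; only the state names ON and UNAVAIL and the samu commands come from the slide. Whether samu itself refuses an early alloc is not stated in the slides, so the refusal here is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class GrowModel:
    """Toy model of an online grow; not an interface to samu or samd."""
    shared: bool
    mounted_clients: set[str]
    updated_clients: set[str] = field(default_factory=set)
    lun_state: str = "ABSENT"  # hypothetical placeholder before 'add eq'

    def add(self) -> None:
        # Models 'samu: add eq'. Standalone QFS goes straight to ON,
        # Shared QFS waits in UNAVAIL for the clients.
        self.lun_state = "UNAVAIL" if self.shared else "ON"

    def client_updated_mcf(self, host: str) -> None:
        # Models a client finishing its mcf update (e.g. samd buildmcf, samd conf).
        self.updated_clients.add(host)

    def alloc(self) -> None:
        # Models 'samu: alloc eq'; allowed only once every mounted client is updated.
        pending = self.mounted_clients - self.updated_clients
        if self.lun_state != "UNAVAIL" or pending:
            raise RuntimeError(f"cannot alloc yet; pending clients: {sorted(pending)}")
        self.lun_state = "ON"


if __name__ == "__main__":
    grow = GrowModel(shared=True, mounted_clients={"pluto", "venus"})
    grow.add()
    for host in ("pluto", "venus"):
        grow.client_updated_mcf(host)
    grow.alloc()
    print(grow.lun_state)
```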
Page 19
Online Shrink
• Support on-line shrink of file system (remove data LUN or stripe group in a ma file system)
  > samu command: release eq
    > After release command, the LUN state is NOALLOC
    > Archived files on the LUN are released and the files are marked offline
    > After all files have been released, the LUN state is OFF
  > samu command: remove eq
    > After remove command, the LUN state is NOALLOC
    > The LUN data is copied to other available LUNs
    > After all data has been moved, the LUN state is OFF
  > Results of the shrink are in the shrink.log (configured in /etc/opt/SUNWsamfs/shrink.cmd)
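Both shrink paths share one state progression: NOALLOC while the LUN is being drained, then OFF once it is empty. A minimal sketch of that progression follows. The function name and the per-file "has an archive copy" flag are hypothetical. The slides do not say what release does with a file that has no archive copy, so the sketch assumes such a file stays on the LUN and the LUN never reaches OFF.

```python
def shrink(files_on_lun: dict[str, bool], command: str) -> list[str]:
    """Return the LUN state trace for a toy shrink.

    files_on_lun maps a file name to whether it has an archive copy.
    command is 'release' (drop the disk copy, file goes offline) or
    'remove' (copy the data to other LUNs).
    """
    if command not in ("release", "remove"):
        raise ValueError(command)
    trace = ["ON", "NOALLOC"]  # no new allocations while draining
    remaining = dict(files_on_lun)
    for name, archived in files_on_lun.items():
        if command == "release" and not archived:
            # Assumption: a file without an archive copy cannot be released.
            continue
        del remaining[name]
    if not remaining:
        trace.append("OFF")  # the LUN holds no more data
    return trace


if __name__ == "__main__":
    print(shrink({"file1": True, "file87": True}, "release"))
    print(shrink({"file1": True, "file87": False}, "release"))
```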
Page 20
Shared QFS Rolling Upgrades
• Support upgrading Shared QFS without taking down the rest of the cluster
  > Requires that the MDS and alternate MDSs are upgraded first
  > Clients can then be upgraded individually
  > Limit QFS versions to n and n+1 with the MDSs always being upgraded to n+1 first
  > Base support starting at 5.0
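The ordering rules for a rolling upgrade can be expressed as a check that runs before each host is upgraded. This is only a sketch of the two constraints named on the slide (at most versions n and n+1 in the cluster, and all MDSs on n+1 before any client). The function name, host names, and the use of plain integers as release counters are hypothetical.

```python
def check_upgrade_step(versions: dict[str, int], mds_hosts: set[str],
                       host: str) -> bool:
    """Return True if upgrading `host` by one release keeps the cluster valid.

    versions maps host name -> abstract integer release counter.
    """
    after = dict(versions)
    after[host] += 1
    if max(after.values()) - min(after.values()) > 1:
        return False  # more than n and n+1 would be present
    newest = max(after.values())
    mds_all_new = all(after[h] == newest for h in mds_hosts)
    client_is_new = any(v == newest for h, v in after.items()
                        if h not in mds_hosts)
    # A client may be on the newest release only once every MDS already is.
    return mds_all_new or not client_is_new


if __name__ == "__main__":
    cluster = {"earth": 1, "mars": 1, "pluto": 1, "venus": 1}
    mds = {"earth", "mars"}
    print(check_upgrade_step(cluster, mds, "pluto"))  # client before the MDSs
    print(check_upgrade_step(cluster, mds, "earth"))  # MDS first
```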
Page 21
Directory Performance Improvements
• Trust the directory caching
  > Trust the cache when the entry exists and do not read the entry from disk to verify
• Increased our recommended files per directory to 500,000
• Performance measurements
  > Rewrite of 500,000 existing files in the same directory is 33% faster; Removal is 800% faster
  > Postmark test of 1 million files and 100 directories (10,000 files per directory) is 10% faster
Page 22
Zones
• There are 2 types of zones: global and non-global
• There are two ways to use zones:
  > Standalone QFS is mounted first in the global zone and then mounted in the non-global zone with the loopback mount option
  > Shared QFS is mounted first in the global zone beneath a zonepath
    > Shared QFS is visible in only one non-global zone
• SAM runs in the global zone
  > SAM admin commands only execute in the global zone
Page 23
SMF – Solaris Service Management Facility
• SAM and QFS automatic startup, monitoring, and restart capabilities are managed and observable via SMF
  > Manage error conditions under SMF
  > sam-fsd, which was started in /etc/inittab, will be changed to be (re-)started by SMF
  > fsmgmtd was already changed from /etc/inittab and is (re-)started by SMF
Page 24
SAM-QFS Manager Enhancements in 5.0
• Add support for the configuration and mgmt. of large scale Shared QFS systems
  > Provide a status overview with drill-down to allow users to quickly understand the status of the file system and its clients
  > Allow addition of multiple clients at once
  > Perform operations on sets of managed clients, e.g. mount/umount/change mount options
  > Provide a search/filter capability to display clients by name, IP address, QFS version and status
Page 25
SAM-QFS Manager Enhancements in 5.0 continued
• Provide GUI support for online grow and shrink
• Support WORM mount options and file attributes in the file browser
• Introduce a first time configuration checklist
• Allow users to see the catalog and current VSN assignments when pooling or assigning media
• Simplify the monitoring console, add library wizard and new file system wizard based on feedback from a usability study
Page 26
Archiver Scalability Improvements
• Examine list feature improves performance by changing the sam-arfind's worklist from a list of directories to the actual list of modified files
  > In 4.6, directories are scanned to find modified files; this feature minimizes file system metadata impact by the archiver
Page 27
Stager Performance Improvements
• Align stager writes to the disk cache to tape block boundaries
  > Currently, they are never aligned due to the 512 byte tar header
• Stager performance is improved by eliminating the read/modify/write when the stager writes to the disk cache
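The effect of the 512 byte tar header can be seen with a little arithmetic. If one write is issued per tape block, every write boundary in the file is shifted by 512 bytes and never lands on a block multiple. The sketch below shows that shift and one way to re-chunk the data into aligned writes. The 256 KiB block size, the function names, and the assumption that the tar header starts on a tape block boundary are all illustrative; the slides do not describe how the stager actually implements alignment.

```python
TAR_HEADER = 512  # bytes; tar header size stated on the slide


def file_offsets_of_tape_blocks(block: int, count: int) -> list[int]:
    """File offset at which each tape block's payload ends, assuming the
    tar header starts at a tape block boundary (an assumption)."""
    return [k * block - TAR_HEADER for k in range(1, count + 1)]


def aligned_write_sizes(file_size: int, block: int) -> list[int]:
    """Re-chunk the file into writes that start on block boundaries in the
    disk cache: full blocks, then one short tail."""
    full, tail = divmod(file_size, block)
    return [block] * full + ([tail] if tail else [])


if __name__ == "__main__":
    block = 256 * 1024  # hypothetical block size
    for offset in file_offsets_of_tape_blocks(block, 3):
        print(offset, offset % block)  # remainder is never 0
    print(aligned_write_sizes(600_000, block))
```

A write that ends partway through a block forces the file system to read the block, merge the new bytes, and write it back, which is the read/modify/write the slide refers to.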
Page 28
Project IDs
• Solaris project IDs are supported
• The Solaris project ID is inherited from the task and is stored in the inode
• The project ID may be reset with the chproj command
Page 29
MySQL Sideband Database
• Solaris Door interface provides fast event-based filesystem notifications
• Events generated by QFS for create, modify, rename, remove, archive, and release/online
• Sam-dbupd update daemon performs specific actions based on the type of event.
  > Consistency checks performed if any problems
• Prepared SQL statements are used to reduce parsing overhead
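The event-driven update pattern can be sketched as a dispatch table that maps each event type to one fixed, parameterized SQL statement. The example uses Python's built-in sqlite3 as a stand-in so that it runs without a MySQL server. The one-table layout, the event dictionaries, and the function names are hypothetical, and the Solaris Door transport is not modeled at all.

```python
import sqlite3

# One parameterized statement per event type; the SQL text is fixed and only
# the bound values change. The table layout is simplified and hypothetical.
STATEMENTS = {
    "create": "INSERT INTO sam_path (ino, gen, path) VALUES (:ino, :gen, :path)",
    "rename": "UPDATE sam_path SET path = :path WHERE ino = :ino AND gen = :gen",
    "remove": "DELETE FROM sam_path WHERE ino = :ino AND gen = :gen",
}


def open_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the sideband database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sam_path (ino INTEGER, gen INTEGER, path TEXT,"
               " PRIMARY KEY (ino, gen))")
    return db


def apply_event(db: sqlite3.Connection, event: dict) -> None:
    """Dispatch on the event type, as an update daemon might."""
    sql = STATEMENTS.get(event["type"])
    if sql is None:
        raise ValueError(f"unhandled event type: {event['type']}")
    db.execute(sql, event)


if __name__ == "__main__":
    db = open_db()
    apply_event(db, {"type": "create", "ino": 10, "gen": 1,
                     "path": "/samfs1/dira/file1"})
    apply_event(db, {"type": "rename", "ino": 10, "gen": 1,
                     "path": "/samfs1/dirz/file1"})
    print(db.execute("SELECT ino, gen, path FROM sam_path").fetchall())
```

Keeping the SQL text constant per event type is what makes statement reuse possible, since only the bound values differ between events.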
Page 30
MySQL Database Schema
• Relational tables – inode, file, path, and archive
  > Connected by ino/gen

sam_inode (
  ino INT UNSIGNED NOT NULL,
  gen INT UNSIGNED NOT NULL,
  type TINYINT UNSIGNED NOT NULL,
  size BIGINT UNSIGNED DEFAULT 0,
  csum CHAR(32),
  create_time INT UNSIGNED DEFAULT 0,
  modify_time INT UNSIGNED DEFAULT 0,
  uid INT UNSIGNED NOT NULL,
  gid INT UNSIGNED NOT NULL,
  online TINYINT UNSIGNED NOT NULL,
  PRIMARY KEY (ino, gen))

sam_path (
  ino INT UNSIGNED NOT NULL,
  gen INT UNSIGNED NOT NULL,
  path VARCHAR(4096),
  PRIMARY KEY (ino, gen),
  INDEX (path))

sam_file (
  p_ino INT UNSIGNED NOT NULL,
  p_gen INT UNSIGNED NOT NULL,
  name_hash SMALLINT UNSIGNED NOT NULL,
  name VARCHAR(256) NOT NULL,
  ino INT UNSIGNED NOT NULL,
  gen INT UNSIGNED NOT NULL,
  PRIMARY KEY (p_ino, p_gen, name_hash, name),
  INDEX (ino, gen))

sam_archive (
  ino INT UNSIGNED NOT NULL,
  gen INT UNSIGNED NOT NULL,
  copy TINYINT UNSIGNED NOT NULL,
  seq SMALLINT UNSIGNED NOT NULL,
  media_type CHAR(4) NOT NULL,
  vsn CHAR(32) NOT NULL,
  position BIGINT UNSIGNED NOT NULL,
  offset INT UNSIGNED NOT NULL,
  size BIGINT UNSIGNED NOT NULL,
  create_time INT UNSIGNED DEFAULT 0,
  stale TINYINT UNSIGNED DEFAULT 0,
  PRIMARY KEY (ino, gen, copy, seq),
  INDEX (media_type, vsn))
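Since the tables are connected by the (ino, gen) pair, a question such as "which files have a copy on this volume" becomes a join between sam_archive and sam_path. The sketch below runs such a join against an in-memory SQLite database that holds only a subset of the columns. SQLite stands in for MySQL, the sample rows are invented, and treating stale = 1 as "outdated copy, ignore it" is an assumption about the column's meaning.

```python
import sqlite3


def files_on_vsn(db: sqlite3.Connection, vsn: str) -> list[str]:
    """Paths that have at least one non-stale archive copy on the given VSN."""
    rows = db.execute(
        "SELECT DISTINCT p.path FROM sam_archive a "
        "JOIN sam_path p ON p.ino = a.ino AND p.gen = a.gen "
        "WHERE a.vsn = ? AND a.stale = 0 ORDER BY p.path",
        (vsn,),
    )
    return [path for (path,) in rows]


def demo_db() -> sqlite3.Connection:
    """Tiny in-memory database with a subset of the slide's columns."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE sam_path (ino INTEGER, gen INTEGER, path TEXT,
                               PRIMARY KEY (ino, gen));
        CREATE TABLE sam_archive (ino INTEGER, gen INTEGER, copy INTEGER,
                                  seq INTEGER, vsn TEXT, stale INTEGER DEFAULT 0,
                                  PRIMARY KEY (ino, gen, copy, seq));
        INSERT INTO sam_path VALUES (10, 1, '/samfs1/dira/file1'),
                                    (11, 1, '/samfs1/dirz/file87');
        INSERT INTO sam_archive VALUES (10, 1, 1, 0, 'VSN001', 0),
                                       (11, 1, 1, 0, 'VSN002', 0),
                                       (11, 1, 2, 0, 'VSN001', 1);
    """)
    return db


if __name__ == "__main__":
    print(files_on_vsn(demo_db(), "VSN001"))
```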
Page 31
Samdb Database Utility
• Samdb for managing filesystem databases
  > Check – Consistency check against filesystem
  > Create – New database for filesystem
  > Dump – Dump contents of database (for samfsdump)
  > Drop – Delete a database
  > Load – Load database contents (from samfsdump)
  > Query – Query database based on file / vsn
    > samdb query samfs1 -v VSN001 /samfs1/dira/file1 /samfs1/dirz/file87
Page 32
Samfsdump Performance
• Utilize a MySQL database to collect create, remove, rename, and archiving events. The file system metadata is still the primary copy
• Designed to improve full file system samfsdump performance by reducing samfsdump data capture window
• Database and inode information are post-processed into the samfsdump file format
• Existing samfsdump functionality remains available
Page 33
Sam Parameters & Defaults
• maxphys = 1MB or /etc/system setting, whichever is bigger
• wr_throttle = 2% of main memory (e.g., 16GB memory = 320MB wr_throttle)
• archmax value increased by device (LTO, T10k, 3592, TS1120 = 22GB; 9940 = 11GB; 9840 = 4GB; DLT = 11GB; all others 8GB; optical & disk 1GB)
• startage = 2h, startcount = 500000 files, startsize = 90% of archmax
• archivemeta = off, sort = path, removed -join command
• archiver.cmd defaults with wait on so admin can build archiver.cmd file
• Archiver (16,8192) and stager (10,1024) bufsize and limit values increased
• maxactive = 5000/gigabyte of memory, maximum = 500000
• stage_n_window default size = 8192k
• defaults.conf: avail_timeout – Stager delays before unloading a volume (default = 0)
• Split disk archive streams based on 1GB increments
• Recycler mingain based on media type (< 200GB 60%, >= 200GB 90%)
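Several of these defaults are derived from memory size or media capacity, so a few lines of arithmetic make the rules concrete. The function names below are hypothetical and the code only restates the formulas from the slide. It uses 1 GB = 1000 MB because that is what reproduces the slide's own example of 16GB giving a 320MB wr_throttle.

```python
def wr_throttle_mb(memory_gb: int) -> int:
    """2% of main memory, using 1 GB = 1000 MB to match the slide's example."""
    return memory_gb * 1000 * 2 // 100


def maxactive(memory_gb: int) -> int:
    """5000 per gigabyte of memory, capped at 500000."""
    return min(5000 * memory_gb, 500_000)


def recycler_mingain_percent(media_capacity_gb: int) -> int:
    """60% for media under 200 GB, 90% for 200 GB and larger."""
    return 60 if media_capacity_gb < 200 else 90


def startsize_mb(archmax_gb: int) -> int:
    """90% of archmax, again with 1 GB = 1000 MB."""
    return archmax_gb * 1000 * 90 // 100


if __name__ == "__main__":
    print(wr_throttle_mb(16), maxactive(16), startsize_mb(22))
    print(recycler_mingain_percent(100), recycler_mingain_percent(800))
```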
Page 34
Performance Test Status
• Results with the PostMark file system benchmark
- Postmark test run on 03/25/09 using the 6800 and 8 T3's.
- Postmark test run on 03/26
- ma-mm-mr file system type with 8 meta data devices (mm) and 8 data (mr) devices.
- ms-md file system type with same 8 devices as for the ma-mm-mr runs
- Each run used 200,000 files in one directory and 200,000 transactions.

Times (sec)    SAM-QFS 4.6     SAM-QFS 5.0    Improvement    SAM-QFS 4.6     SAM-QFS 5.0    Improvement
               build 4.6.73    build 5.0.4                   build 4.6.73    build 5.0.4
Creation       62              42             1.5 x          62              43             1.4 x
Transaction    4,256           1,060          4.0 x          3,407           319            10.7 x
Deletion       132             83             1.6 x          92              88             1.0 x
Overall        4,450           1,185          3.8 x          3,561           450            7.9 x

• Additional performance tests

Test                                                     SAM-QFS 4.6     SAM-QFS 5.0    Improvement
Times (sec)                                              build 4.6.73    build 5.0.4
Stager wall time                                         845             528            1.6 x
Stager CPU time                                          655             2              330 x
samfsdump wall time after archiving 6 million files      1315            125            11 x
  (effect of MySQL side band)
100 million file archive test wall time                  18 hr           14 hr          1.3 x

Stager tests were run with 4 large files and the stager set so that if direct I/O is used, it is properly aligned.
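The Improvement column is the 4.6 time divided by the 5.0 time, and the Overall row is the sum of the three phases. A short sketch that recomputes both from the first of the two PostMark result sets on the slide follows. The function and variable names are hypothetical; the numbers are copied from the slide.

```python
def improvement(old_seconds: float, new_seconds: float) -> float:
    """Speedup factor: old time divided by new time, to one decimal place."""
    return round(old_seconds / new_seconds, 1)


# (4.6 time, 5.0 time) in seconds per PostMark phase, first result set.
FIRST_SET = {
    "Creation": (62, 42),
    "Transaction": (4256, 1060),
    "Deletion": (132, 83),
}


def overall(phases: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the per-phase times for each release."""
    old = sum(o for o, _ in phases.values())
    new = sum(n for _, n in phases.values())
    return old, new


if __name__ == "__main__":
    for phase, (old, new) in FIRST_SET.items():
        print(phase, improvement(old, new))
    print("Overall", overall(FIRST_SET), improvement(*overall(FIRST_SET)))
```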
Page 35
5.0u1 Planned Features
• Release of 5.0u1 planned for fall, 2009
• Solaris 10 & OpenSolaris (SPARC/AMD/Intel)
• Red Hat Linux 5.0 support added
• Extended Attributes
• Increased HA support with Solaris Cluster
• Device Qualifications