Virtualization in the NCAR Mass Storage System
Gene Harano
National Center for Atmospheric Research
Scientific Computing Division
High Performance Systems Section
CAS2K3, September 9, 2003
Copyright © 2003 University Corporation for Atmospheric Research
Outline
• Overview
• NCAR Mass Storage System (MSS) Architecture
• Device Virtualization in the NCAR MSS
• Benefits of Virtualization
National Center for Atmospheric Research
Archive Statistics
• MSS Statistics – End of Aug 2003
– 1.457 PB total stored in 19.9 M files
– 45 TB/month total net growth rate
– 1.2 M user MSS host reads and writes/month
– 80 TB moved/month on MSS hosts
– 70 TB moved/month internal migration, 2nd copy, data ooze (total varies widely)
– 150 TB moved/month
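As a rough worked example of what these figures imply, the sketch below derives the average file size, the average size of a host read or write, and the combined monthly data movement. It assumes decimal units (1 TB = 10^12 bytes), which the slide does not state, and all variable names are illustrative.

```python
# Figures quoted on the slide (end of Aug 2003).
# Decimal units are an assumption; the slide does not say which convention it uses.
PB = 10**15
TB = 10**12
MB = 10**6

total_stored = 1.457 * PB        # bytes in the archive
total_files = 19.9e6             # files in the archive
host_ops_per_month = 1.2e6       # user reads and writes on MSS hosts
host_moved_tb = 80               # TB/month moved on MSS hosts
internal_moved_tb = 70           # TB/month migration, 2nd copy, data ooze
net_growth_tb = 45               # TB/month net growth

avg_file_mb = total_stored / total_files / MB
avg_host_transfer_mb = host_moved_tb * TB / host_ops_per_month / MB
monthly_moved_tb = host_moved_tb + internal_moved_tb
moved_per_tb_of_growth = monthly_moved_tb / net_growth_tb

print(f"average file size:        {avg_file_mb:.1f} MB")
print(f"average host transfer:    {avg_host_transfer_mb:.1f} MB")
print(f"total moved per month:    {monthly_moved_tb} TB")
print(f"TB moved per TB of growth: {moved_per_tb_of_growth:.2f}")
```

Under these assumptions the average file is about 73 MB, the average host transfer about 67 MB, and the 80 TB and 70 TB figures add up to the 150 TB moved per month quoted on the slide.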
NCAR MSS Total Growth
[Chart: total data stored, y-axis 0 TB to 1600 TB, x-axis Sep-86 to Mar-03 in six-month steps]
Computing Roadmap
[Chart: sustained TFLOPs, y-axis 0.0 to 1.2, x-axis 2001 to 2005]
Concept
• Unlimited number of files
• Unlimited file size
• Unlimited file retention
• Near zero access latency
• Hardware cost reduced 90% or more
• Personnel cost increased 90% or more
• Charge-back eliminated
Concept
• Data Virtualization
• Can only store “0” value bits!!
1
Architecture
The NCAR MSS
• Archive, not a file server
• Custom system operational since 1986
• Based on the IEEE Mass Storage Model
• Tape based archive with a small amount of disk cache for small files. Major area of focus – multi-TeraByte internal disk cache
• Separation of control and data paths
• Implemented 3rd party transfers to optimize data movement and increase scalability
Traditional Server
[Diagram labels: MSCP; Host (x3); Server; Control/Metadata/Data; Control/Data]
3rd Party Transfer: Control/Data Separation
[Diagram labels: MSCP; MFD; HPDF; Host (x3); Control/Metadata; Data; control]
High Performance Data Fabric (SAN)
• NCAR MSS has used a SAN since 1986.
• Direct tape and disk access via the SAN
• 30+ heterogeneous hosts (including 2nd campus 16km away)
• Separation of control and data paths
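The separation of control and data paths can be illustrated with a minimal in-process sketch. It is not the MSS implementation: `ControlServer`, `Device`, `Host`, and the placement rule are hypothetical names and choices made for illustration. The point it shows is that the control component only answers "which device?", while the host moves the bytes directly to or from that device.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical disk or tape endpoint reachable over the data fabric."""
    name: str
    contents: dict = field(default_factory=dict)

    def write(self, bitfile: str, data: bytes) -> None:
        self.contents[bitfile] = data

    def read(self, bitfile: str) -> bytes:
        return self.contents[bitfile]


@dataclass
class ControlServer:
    """Handles requests and metadata only; file data never passes through it."""
    devices: list
    catalog: dict = field(default_factory=dict)  # bitfile name -> device

    def request_write(self, bitfile: str) -> Device:
        # Illustrative placement rule (an assumption): fewest bitfiles wins.
        device = min(self.devices, key=lambda d: len(d.contents))
        self.catalog[bitfile] = device
        return device

    def request_read(self, bitfile: str) -> Device:
        return self.catalog[bitfile]


class Host:
    """Client host: asks the control server where, then talks to the device."""

    def __init__(self, control: ControlServer):
        self.control = control

    def put(self, bitfile: str, data: bytes) -> str:
        device = self.control.request_write(bitfile)  # control path
        device.write(bitfile, data)                   # data path, host to device
        return device.name

    def get(self, bitfile: str) -> bytes:
        device = self.control.request_read(bitfile)   # control path
        return device.read(bitfile)                   # data path, device to host
```

In a traditional server design the `put` and `get` payloads would flow through the server itself; here adding hosts or devices increases data bandwidth without adding load to the control component.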
Transparency
• The Archive stores bitfiles
• String of bits without regard to content
• No need to maintain/understand record boundaries
• No need to maintain/understand record structure
Device Virtualization
• The NCAR MSS utilizes 2 types of device virtualization
– Vendor supplied (in hardware)
– Custom built (in software – application level)
Vendor Supplied
• Device emulation
– RAID devices emulate SCSI disks
– Tape devices emulate IBM 3490/3590 and SCSI
• Virtual tape
– Available from vendors but not used by the NCAR MSS
– How would you utilize a 1 TB capacity tape medium to its maximum capacity?
Custom Built
• Application level software – no OS mods
• String of bits stored without regard to device type
• Storage Manager component presents a socket interface to clients. In-band between client and storage device.
• Underlying storage device (disk, tape, ???) is not exposed to clients
• FC RAID and FC tape
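A minimal sketch of application-level device virtualization follows. It is an illustration under stated assumptions, not the MSS Storage Manager: the real component presents a socket interface, whereas this sketch uses plain method calls, and the class names, the size-based placement rule, and the `migrate` helper are all hypothetical. What it shows is that clients name a bitfile and never see whether it sits on a random-access or a sequential device.

```python
import io
from abc import ABC, abstractmethod


class Device(ABC):
    """Common interface; a handle is opaque to everything outside the device."""

    @abstractmethod
    def store(self, data: bytes): ...

    @abstractmethod
    def fetch(self, handle) -> bytes: ...


class DiskDevice(Device):
    """Random-access device: each bitfile gets its own slot number."""

    def __init__(self):
        self._slots = {}

    def store(self, data: bytes):
        handle = len(self._slots)
        self._slots[handle] = bytes(data)
        return handle

    def fetch(self, handle) -> bytes:
        return self._slots[handle]


class TapeDevice(Device):
    """Sequential device: bitfiles are appended; the handle is (offset, length)."""

    def __init__(self):
        self._tape = io.BytesIO()

    def store(self, data: bytes):
        offset = self._tape.seek(0, io.SEEK_END)
        self._tape.write(data)
        return (offset, len(data))

    def fetch(self, handle) -> bytes:
        offset, length = handle
        self._tape.seek(offset)
        return self._tape.read(length)


class StorageManager:
    """Clients call put/get by name; the device type is never exposed."""

    def __init__(self, devices: dict, small_file_limit: int = 1024):
        self._devices = devices
        self._limit = small_file_limit   # assumed placement threshold
        self._index = {}                 # bitfile name -> (device kind, handle)

    def put(self, name: str, data: bytes) -> None:
        kind = "disk" if len(data) <= self._limit else "tape"
        self._index[name] = (kind, self._devices[kind].store(data))

    def get(self, name: str) -> bytes:
        kind, handle = self._index[name]
        return self._devices[kind].fetch(handle)

    def migrate(self, name: str, kind: str) -> None:
        # Bits are copied as-is; no knowledge of record structure is needed.
        self._index[name] = (kind, self._devices[kind].store(self.get(name)))
```

Supporting a new storage technology in this sketch means adding one more `Device` subclass; `put`, `get`, and every client stay unchanged.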
Benefits
Benefits
• Quickly introduce new storage technologies
– Coding modifications are localized
– Vendor supplied virtualization may not require coding modifications (e.g. virtual tape, RAIT)
• Data can move easily between storage media, in concert with bitfile transparency.
• Reduces system complexity
Benefits
• Enables new functionality
– Split data streams
• Multiple simultaneous copies – Disaster Recovery, reliability
• RAIT – performance and reliability
• Simultaneous disk and tape migration/staging
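One way to picture a split data stream is block striping with a parity stripe, in the spirit of RAIT. The slides do not describe the actual layout, so the scheme below (round-robin blocks plus a single XOR parity stripe, with hypothetical function names `split_stream` and `join_stream`) is an assumption chosen for illustration. It shows both stated goals: several devices can be written in parallel, and one lost stripe can be rebuilt.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_stream(data: bytes, n_data: int, block: int = 4):
    """Deal fixed-size blocks round-robin onto n_data stripes plus one parity stripe."""
    row_size = block * n_data
    padded_len = -(-len(data) // row_size) * row_size  # round up to a full row
    padded = data.ljust(padded_len, b"\0")
    stripes = [bytearray() for _ in range(n_data)]
    parity = bytearray()
    for row in range(0, padded_len, row_size):
        blocks = [padded[row + i * block: row + (i + 1) * block]
                  for i in range(n_data)]
        for stripe, blk in zip(stripes, blocks):
            stripe += blk
        parity += xor_blocks(blocks)
    return [bytes(s) for s in stripes], bytes(parity)


def join_stream(stripes, parity: bytes, length: int, block: int = 4) -> bytes:
    """Reassemble the stream; a single stripe given as None is rebuilt from parity."""
    stripes = list(stripes)
    missing = [i for i, s in enumerate(stripes) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost stripe")
    if missing:
        survivors = [s for s in stripes if s is not None] + [parity]
        stripes[missing[0]] = xor_blocks(survivors)
    out = bytearray()
    for offset in range(0, len(parity), block):
        for stripe in stripes:
            out += stripe[offset:offset + block]
    return bytes(out[:length])


data = b"virtualization is good!"
stripes, parity = split_stream(data, n_data=3)
damaged = [stripes[0], None, stripes[2]]   # pretend one tape is unreadable
restored = join_stream(damaged, parity, len(data))
```

Writing the same stream to several devices at once ("multiple simultaneous copies") is the simpler case: every device receives the full stream rather than a stripe of it.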
Summary
• Virtualization is good!
• Vendor supplied virtualization is even better!
• Virtualization
– Reduces complexity
– Optimizes resource utilization
– Lowers costs