Virtualization in the NCAR Mass Storage System

Gene Harano

National Center for Atmospheric Research

Scientific Computing Division

High Performance Systems Section

[email protected]

CAS2K3, September 9, 2003. Copyright © 2003 University Corporation for Atmospheric Research

Outline

• Overview

• NCAR Mass Storage System (MSS) Architecture

• Device Virtualization in the NCAR MSS

• Benefits of Virtualization

National Center for Atmospheric Research

Archive Statistics

• MSS Statistics – End of Aug 2003

– 1.457 PBs total stored in 19.9 M files

– 45 TB/month total net growth rate

– 1.2 M user MSS host reads and writes/month

– 80 TBs moved/month on MSS hosts

– 70 TBs moved/month internal migration, 2nd copy, data ooze (total varies widely)

– 150 TBs moved/month in total
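A quick check of what the figures above imply can be sketched as follows. The variable names are illustrative, and decimal units (1 PB = 1000 TB, 1 TB = 10^6 MB) are an assumption; the slide does not say which convention is used.

```python
# Figures quoted on the slide (end of August 2003).
# Assumption: decimal units, 1 PB = 1000 TB and 1 TB = 1e6 MB.
total_stored_pb = 1.457
file_count_millions = 19.9
net_growth_tb_per_month = 45
host_traffic_tb_per_month = 80
internal_traffic_tb_per_month = 70

total_stored_tb = total_stored_pb * 1000

# Average bitfile size: total MB divided by number of files.
avg_file_size_mb = (total_stored_tb * 1e6) / (file_count_millions * 1e6)

# Host traffic plus internal traffic gives the monthly total on the slide.
total_moved_tb_per_month = host_traffic_tb_per_month + internal_traffic_tb_per_month

# Net growth as a share of the current holdings.
monthly_growth_pct = 100 * net_growth_tb_per_month / total_stored_tb

print(f"average file size: {avg_file_size_mb:.1f} MB")
print(f"total moved per month: {total_moved_tb_per_month} TB")
print(f"net growth per month: {monthly_growth_pct:.1f}% of holdings")
```

Under these assumptions the average file is roughly 73 MB, and the archive moves about three times as much data each month as it gains in net growth.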

NCAR MSS Total Growth

[Figure: total data stored in the MSS, vertical axis 0 TB to 1600 TB, horizontal axis Sep-86 to Mar-03]

Computing Roadmap

[Figure: sustained TFLOPs (axis 0.0 to 1.2) by year, 2001 to 2005]

Concept

• Unlimited number of files

• Unlimited file size

• Unlimited file retention

• Near zero access latency

• Hardware cost reduced 90% or more

• Personnel cost increased 90% or more

• Charge-back eliminated

Concept

• Data Virtualization

• Can only store “0” value bits!!

Architecture

The NCAR MSS

• Archive, not a file server

• Custom system operational since 1986

• Based on the IEEE Mass Storage Model

• Tape-based archive with a small amount of disk cache for small files. Major area of focus – multi-TeraByte internal disk cache

• Separation of control and data paths

• Implemented 3rd party transfers to optimize data movement and increase scalability
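The separation of control and data paths can be illustrated with a minimal in-memory sketch. All names here (`ControlServer`, `DataMover`, `host_store`, `host_fetch`, the ticket strings) are hypothetical and are not taken from the NCAR MSS; the point is only that the control component hands out a reference while the payload travels directly between host and storage.

```python
from dataclasses import dataclass, field


@dataclass
class DataMover:
    """Hypothetical stand-in for a storage device on the data path."""
    blocks: dict = field(default_factory=dict)

    def write(self, ticket: str, payload: bytes) -> None:
        self.blocks[ticket] = payload

    def read(self, ticket: str) -> bytes:
        return self.blocks[ticket]


@dataclass
class ControlServer:
    """Hypothetical control component: metadata only, no file contents."""
    catalog: dict = field(default_factory=dict)
    bytes_seen: int = 0  # stays 0 because no payload passes through here
    next_id: int = 0

    def open_for_write(self, name: str) -> str:
        ticket = f"t{self.next_id}"
        self.next_id += 1
        self.catalog[name] = ticket
        return ticket

    def lookup(self, name: str) -> str:
        return self.catalog[name]


def host_store(control: ControlServer, mover: DataMover, name: str, payload: bytes) -> None:
    ticket = control.open_for_write(name)  # control path: small request and reply
    mover.write(ticket, payload)           # data path: bypasses the control server


def host_fetch(control: ControlServer, mover: DataMover, name: str) -> bytes:
    return mover.read(control.lookup(name))


control, mover = ControlServer(), DataMover()
host_store(control, mover, "model_run_001", b"\x00\x01" * 4)
restored = host_fetch(control, mover, "model_run_001")
print(restored == b"\x00\x01" * 4, control.bytes_seen)
```

Because the control component never carries payload bytes, its load in this sketch grows with the number of requests rather than with the volume of data moved, which is one plausible reading of the scalability claim above.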

Traditional Server

[Diagram labels: MSCP, Host, Server, Control/Metadata/Data, Control/Data]

3rd Party Transfer: Control/Data Separation

[Diagram labels: MSCP, MFD, HPDF, Host, Control/Metadata, Data, control]

High Performance Data Fabric (SAN)

• NCAR MSS has used a SAN since 1986.

• Direct tape and disk access via the SAN

• 30+ heterogeneous hosts (including 2nd campus 16 km away)

• Separation of control and data paths

Transparency

• The Archive stores bitfiles

• String of bits without regard to content

• No need to maintain/understand record boundaries

• No need to maintain/understand record structure
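Bitfile transparency can be sketched as an archive that accepts and returns opaque bytes. The `BitfileArchive` class and the length-prefixed record layout below are assumptions made for illustration; the record structure exists only on the client side and the archive never parses it.

```python
import struct


class BitfileArchive:
    """Hypothetical archive: stores each bitfile as an uninterpreted byte string."""

    def __init__(self):
        self._bitfiles = {}

    def put(self, name: str, bits: bytes) -> None:
        self._bitfiles[name] = bytes(bits)

    def get(self, name: str) -> bytes:
        return self._bitfiles[name]


def encode_records(records):
    """Client-side framing: 4-byte big-endian length, then the record body."""
    return b"".join(struct.pack(">I", len(r)) + r for r in records)


def decode_records(bits):
    """Client-side parsing of the framing produced by encode_records."""
    out, pos = [], 0
    while pos < len(bits):
        (length,) = struct.unpack_from(">I", bits, pos)
        pos += 4
        out.append(bits[pos:pos + length])
        pos += length
    return out


records = [b"temp", b"pressure", b"wind"]
encoded = encode_records(records)

archive = BitfileArchive()
archive.put("obs.dat", encoded)
returned = archive.get("obs.dat")

print(returned == encoded, decode_records(returned) == records)
```

Only the client knows where the record boundaries are; the archive's responsibility ends at returning the same bits it was given.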

Device Virtualization

• The NCAR MSS utilizes 2 types of device virtualization

– Vendor supplied (in hardware)

– Custom built (in software – application level)

Vendor Supplied

• Device emulation

– RAID devices emulate SCSI disks

– Tape devices emulate IBM 3490/3590 and SCSI

• Virtual tape

– Available from vendors but not used by the NCAR MSS

– How would you utilize a 1 TB capacity tape medium to its maximum capacity?

Custom Built

• Application level software – no OS mods

• String of bits stored without regard to device type

• Storage Manager component presents a socket interface to clients. In-band between client and storage device.

• Underlying storage device (disk, tape, ???) is not exposed to clients

• FC RAID and FC tape
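The idea of hiding the device type behind a Storage Manager can be sketched as below. This is a simplified illustration, not the NCAR implementation: plain method calls stand in for the socket interface, and the class names, the `small_file_limit` parameter, and the size-based placement rule are all assumptions.

```python
from abc import ABC, abstractmethod


class StorageDevice(ABC):
    """Hypothetical device interface seen only by the Storage Manager."""

    @abstractmethod
    def write(self, key: str, bits: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class DiskDevice(StorageDevice):
    """Random-access device: one entry per bitfile."""

    def __init__(self):
        self._blocks = {}

    def write(self, key, bits):
        self._blocks[key] = bits

    def read(self, key):
        return self._blocks[key]


class TapeDevice(StorageDevice):
    """Sequential device: bitfiles are appended and located by offset."""

    def __init__(self):
        self._tape = bytearray()
        self._index = {}

    def write(self, key, bits):
        self._index[key] = (len(self._tape), len(bits))
        self._tape += bits

    def read(self, key):
        start, length = self._index[key]
        return bytes(self._tape[start:start + length])


class StorageManager:
    """Clients call store/retrieve and never learn which device was used."""

    def __init__(self, disk, tape, small_file_limit=16):
        self._devices = {"disk": disk, "tape": tape}
        self._location = {}
        self._limit = small_file_limit  # assumed placement threshold, in bytes

    def store(self, name: str, bits: bytes) -> None:
        kind = "disk" if len(bits) <= self._limit else "tape"
        self._devices[kind].write(name, bits)
        self._location[name] = kind

    def retrieve(self, name: str) -> bytes:
        return self._devices[self._location[name]].read(name)


manager = StorageManager(DiskDevice(), TapeDevice())
manager.store("small", b"abc")
manager.store("large", b"x" * 1000)
small_back = manager.retrieve("small")
large_back = manager.retrieve("large")
print(small_back == b"abc", large_back == b"x" * 1000)
```

Adding a new device type in this sketch means writing one more `StorageDevice` subclass and adjusting the placement rule; client code is untouched.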

Benefits

Benefits

• Quickly introduce new storage technologies

– Coding modifications are localized

– Vendor supplied virtualization may not require coding modifications (e.g. virtual tape, RAIT)

• Data can move easily between storage media, in concert with bitfile transparency.

• Reduces system complexity

Benefits

• Enables new functionality

– Split data streams

• Multiple simultaneous copies – Disaster Recovery, reliability

• RAIT – performance and reliability

• Simultaneous disk and tape migration/staging
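Two of the split-stream uses listed above can be sketched in a few lines. RAIT stands for Redundant Array of Independent Tapes; the slides do not describe how NCAR lays out stripes, so the contiguous chunking and single XOR parity stripe below are assumptions borrowed from RAID-style parity, and all function names are hypothetical.

```python
from functools import reduce


def mirror_write(bits: bytes, sinks: list) -> None:
    """Send the same stream to every sink, e.g. a primary and a second copy."""
    for sink in sinks:
        sink.extend(bits)


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rait_stripe(bits: bytes, data_tapes: int):
    """Split a bitfile into contiguous chunks, one per tape, plus a parity chunk."""
    stripe_len = -(-len(bits) // data_tapes)  # ceiling division
    padded = bits.ljust(stripe_len * data_tapes, b"\x00")
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len] for i in range(data_tapes)]
    return stripes, xor_blocks(stripes)


def rait_recover(stripes, parity, lost_index: int) -> bytes:
    """Rebuild one lost chunk from the surviving chunks and the parity chunk."""
    survivors = [s for i, s in enumerate(stripes) if i != lost_index]
    return xor_blocks(survivors + [parity])


payload = b"atmospheric model output"

# Multiple simultaneous copies.
primary, second_copy = bytearray(), bytearray()
mirror_write(payload, [primary, second_copy])

# Striping across three data tapes with one parity tape.
stripes, parity = rait_stripe(payload, data_tapes=3)
rebuilt = rait_recover(stripes, parity, lost_index=1)

print(bytes(primary) == bytes(second_copy) == payload, rebuilt == stripes[1])
```

Mirroring trades capacity for reliability, while striping with parity spreads one bitfile across several drives so that, in principle, they can be written in parallel and a single lost chunk can still be rebuilt.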

Summary

• Virtualization is good!

• Vendor supplied virtualization is even better!

• Virtualization

– Reduces complexity

– Optimizes resource utilization

– Lowers costs

[email protected]