
Transcript of Storage Area Network Usage: A UNIX SysAdmin’s View of How A SAN Works.

Page 1: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Storage Area Network Usage

A UNIX SysAdmin’s View of How A SAN Works

Page 2: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Disk Storage

Embedded Internal Disks within the System Chassis

Directly Attached External Chassis of Disks connected to a Server via a Cable

Directly Attached Shared External Chassis connected to more than one Server via a Cable

Networked Storage: NAS, SAN, and others

Page 3: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Disk Storage – 2000-2004

Type      Bus Speed   Distance    Cable Pins
ATA       100 MB/s    18 inches   40
SCSI      320 MB/s    12 m        68 or 80
FC        400 MB/s    10 km       4
SATA-II   300 MB/s    6 m         22
SAS       300 MB/s    10 m        22

Page 4: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Deficiencies of Direct Connect Storage

Single System Bears the Entire Cost of Storage
Small Server in an EMC Shop
Large Server cannot easily share its unused storage

Manageability: Fragmented and Isolated

Scalability: Limited. What happens when you run out of peripheral bus slots?

Availability: “SCSI Bus Reset”; failover is a complicated add-on, if available at all

Page 5: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

DASD

Direct Access Storage Device
They still call it this in an IBM Mainframe Shop

Basic Limits of Disk Storage Recognized:
Latency: Rotation Speed of the disk
Seek Time: Radial Movement of the Read/Write Heads
Buffer Sizes: “Stop sending me data, I can’t write fast enough!”

Page 6: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SCSI

SCSI – Small Computer System Interface
From Shugart’s 1979 SASI implementation

SASI: Shugart Associates System Interface

Both Hardware and I/O Protocol Standards
Both have evolved over time
Hardware is the source of most limitations
I/O Protocol has long-term potential

Page 7: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SCSI - Pro

Device Independence
Mix and match device types on the bus: Disk, Tape, Scanners, etc.

Overlapping I/O Capability
Multiple read & write commands can be outstanding simultaneously

Ubiquitous

Page 8: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SCSI - Con

Distance vs. Speed
Double the Signaling Rate (Speed: 40, 80, 160, 320 MBps), Halve the Cable Length Limit

Device Count: 16 Maximum
Low Voltage Differential Ultra3 SCSI can support only 16 devices on a 12 meter cable at 160 MBps

Server Access to Data Resources
Hardware changes are disruptive

Page 9: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SCSI – Overcoming the Con

New Hardware & Signaling Platforms

SCSI-3 Introduces Serial SCSI Support:
Fibre Channel
Serial Storage Architecture (SSA): Primarily an IBM implementation
FireWire (IEEE 1394 – Apple fixes SCSI): Attractive in the consumer market

Retains SCSI I/O Protocol

Page 10: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Scaling SCSI Devices

Increase Controller Count within the Server
Increasing Burden to the CPU: Device Overhead
Bus Controllers can be saturated
You can run out of slots
Many Queues, Many Devices: Queuing Theory 101 (check-out line) - undesirable

Page 11: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Scaling SCSI Devices

Use a Dedicated External Device Controller
Hides Individual Devices: Provides One Large Virtual Resource
Offloads Device Overhead
One Queue, Many Devices - good (see the queueing sketch below)
Cost and Benefit: Still borne by one system
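
A small, self-contained Python simulation (my own illustration, not from the presentation) of the check-out-line point above: at the same total load, one shared queue feeding several devices waits far less than several independent per-controller queues. The arrival rate, service rate, device count, and round-robin assignment are made-up assumptions.

    import random
    random.seed(1)

    def mean_wait(shared, k=4, arrival_rate=3.2, service_rate=1.0, n_jobs=200_000):
        # shared=True: one queue, job goes to the first free device
        # shared=False: jobs are pinned round-robin to k per-controller queues
        t, total_wait = 0.0, 0.0
        free_at = [0.0] * k                             # when each device is next free
        for i in range(n_jobs):
            t += random.expovariate(arrival_rate)       # next I/O arrives
            service = random.expovariate(service_rate)  # its service time
            s = min(range(k), key=free_at.__getitem__) if shared else i % k
            start = max(t, free_at[s])
            free_at[s] = start + service
            total_wait += start - t
        return total_wait / n_jobs

    print("many queues, many devices:", round(mean_wait(shared=False), 2))
    print("one queue, many devices:  ", round(mean_wait(shared=True), 2))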

Page 12: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

RAID

Redundant Array of Inexpensive Disks

Combine multiple disks into a single virtual device

How this is implemented determines different strengths:
Storage Capacity
Speed: Fast Read or Fast Write
Resilience in the face of device failure

Page 13: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

RAID Functions

Striping: Write consecutive logical bytes/blocks on consecutive physical disks

Mirroring: Write the same block on two or more physical disks

Parity Calculation (see the sketch below):
Given N disks, N-1 consecutive blocks are data blocks and the Nth block is for parity.
When any of the N-1 data blocks are altered, N-2 XOR calculations are performed on these N-1 blocks, and the data block(s) and parity block are written.
Destroy one of these N blocks, and that block can be reconstructed using N-2 XOR calculations on the remaining N-1 blocks.
Destroy two or more blocks, and reconstruction is not possible.
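
A minimal Python sketch (my illustration, not from the presentation) of the XOR parity reconstruction just described; the five-disk layout and block contents are made-up example values.

    # Single-parity stripe: the parity block is the XOR of the N-1 data blocks.
    from functools import reduce

    def xor_blocks(blocks):
        # XOR equal-length byte blocks together (N-2 XOR operations for N-1 blocks)
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # four data blocks (N = 5 "disks")
    parity = xor_blocks(data)

    # Lose one data block: XOR of the survivors plus parity rebuilds it.
    lost = 2
    survivors = [blk for i, blk in enumerate(data) if i != lost]
    rebuilt = xor_blocks(survivors + [parity])
    assert rebuilt == data[lost]
    print("reconstructed block:", rebuilt)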

Page 14: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

RAID Function – Pro & Con

Striping
Pro: Increases Spindle Count for Increased Throughput
Con: Does not provide redundancy

Mirroring
Pro: Provides Redundancy without Parity Calculation
Con: Requires at least 100% disk resource overhead

Parity Calculation
Pro: Cuts Disk Resource Overhead to 1/N
Con: Parity calculation is expensive; N-2 calculations are required, and if all N-1 data blocks are not in cache they must be read

Page 15: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

RAID Types

RAID 0: Stripe with No Parity

RAID 1: Mirror two or more disks

RAID 0+1: Stripe on the Inside, Mirror on the Outside

RAID 1+0: Mirrors on the Inside, Stripe on the Outside

RAID 3: Synchronous, Subdivided Block Access; Dedicated Parity Drive

RAID 4: Independent, Whole-Block Access; Dedicated Parity Drive

RAID 5: Like RAID 4, but Parity is striped across multiple drives

Page 16: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

RAID 0 and RAID 1 (diagrams)

Page 17: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

RAID 3 and RAID 5 (diagrams)

Page 18: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

RAID 1+0 and RAID 0+1 (diagrams)

Page 19: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Breaking the Direct Connection

Now you have high-performance RAID
The storage bottleneck has been reduced
You’ve invested $$$ to do it
How do you extend this advantage to N servers without spending N x $$$?

How about using existing networks?

Page 20: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

How to Provide Data Over IP

NFS (or CIFS) over a TCP/IP Network
This is Network Attached Storage (NAS)
Overcomes some distance problems
Full Filesystem Semantics are Lacking, such as file locking
Speed and Latency are problems
Security and Integrity are problems as well

IP Encapsulation of I/O Protocols
Not yet established in the marketplace
Current speed & security issues

Page 21: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

NAS and SAN

NAS – Network Attached Storage
File-oriented access
Multiple Clients, Shared Access to Data

SAN – Storage Area Network
Block-oriented access
Single Server, Exclusive Access to Data

Page 22: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

NAS: Network Attached Storage

File Objects and Filesystems
OS Dependent
OS Access & Authentication

Possible Multiple Writers: Require locking protocols

Network Protocol: i.e., IP

“Front-end” Network

Page 23: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN: Storage Area Network

Block Oriented Access To Data

Device-like Object is presented

Unique Writer

I/O Protocol: SCSI, HIPPI, IPI

“Back-end” Network

Page 24: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.
Page 25: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

A Storage Area Network

Storage: StorageWorks MA8000 (24), EVA (2)
HDS is the 2nd Approved Storage Vendor: 9980 Enterprise Storage Array – EMC-class storage

Switches: Brocade 12000 (8), 3800 (20), & 2800 (34)
3900s are being deployed – 32 ports

UNIX Servers on the SAN: Solaris (56), IRIX (5), HP-UX (5), Tru64 (1)

Storage Volume Connected to UNIX Servers: 13,000 GB as of May 2003

Windows Servers: Windows 2000 (74), NT 4.0 (16)

Page 26: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN Implementations

FibreChannel: FC Signalling Carrying SCSI Commands & Data; Non-Ethernet Network Infrastructure

iSCSI: SCSI Encapsulated by IP; Ethernet Infrastructure

FCIP – FibreChannel over IP: FibreChannel Encapsulated by IP
Extending FibreChannel over WAN Distances
A Future Bridge between Ethernet & FibreChannel
iFCP – another gateway implementation

Page 27: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

NAS & SAN in the Data Center

Page 28: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FCIP In The Data Center

Page 29: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel

How SCSI Limitations are Addressed: Speed, Distance, Device Count, Access

Page 30: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel – Speed

266 Mbps – ten years ago
1063 Mbps – common in 1998
2125 Mbps – available today
4 Gbps – near-future products, backward compatible with 1 & 2 Gbps

10 Gbps – 2005?
Not backward compatible with 1/2/4 Gbps
But 10 Gig Ethernet will compete (remember FDDI & ATM)
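
A side calculation in Python (my illustration) of why these line rates map to the familiar MB/s figures: 1, 2, and 4 Gbps Fibre Channel use 8b/10b encoding, so only 8 of every 10 bits on the wire carry payload.

    # Convert Fibre Channel line rate (Gbaud) to approximate payload MB/s.
    for name, gbaud in [("1 Gb FC", 1.0625), ("2 Gb FC", 2.125), ("4 Gb FC", 4.25)]:
        payload_mb_s = gbaud * 1e9 * 8 / 10 / 8 / 1e6   # strip 8b/10b overhead, bits -> bytes
        print(f"{name}: about {payload_mb_s:.0f} MB/s of payload")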

Page 31: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Why I/O Protocols are Coming to IP

IP Networking is ubiquitous

Gigabit Ethernet is here
10 Gbps Ethernet is just becoming available

Don’t have to invest in a second network
Just upgrade the one you have

IP & Ethernet software is well understood
Existing talent pool for vendors to leverage

Developers, not end-user Network Engineers

Page 32: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel – Distance

1063 Mbps: 175 m (62.5 µm – multi-mode), 500 m (50.0 µm – multi-mode), 10 km (9 µm – single-mode)

2125 Mbps: 500 m (50.0 µm – multi-mode), 2 km (9 µm – single-mode)

Page 33: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel – A Network

Layer 1 – Physical (Media: fiber, copper)
Fibre: 62.5, 50.0, & 9.0 µm
Copper: Cat6, Twinax, Coax, other

Layer 2 – Data Link (Network Interface & MAC)
WWPN: World Wide Port Name
WWNN: World Wide Node Name
In a single-port node, usually WWPN = WWNN
64-bit device address, comparable to 48-bit Ethernet device addresses

Layer 3 – Network (IP & SCSI)
24-bit fabric address, comparable to an IP address

Page 34: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel Terminology: Port Types

N_Port: Node port – Computer, Disk, or Storage Node

F_Port: Fabric port – Found only on a Switch

E_Port: Expansion Port – Switch-to-Switch port

NL_Port: Node port with Arbitrated Loop Capabilities

FL_Port: Fabric port with Arbitrated Loop Capabilities

G_Port: Generic Switch Port – Can act as any of F_Port, E_Port, or FL_Port

Page 35: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.
Page 36: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel - Topology

Point-to-Point

Arbitrated Loop

Fabric

Page 37: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel – Point-to-point

Direct Connection of Server and Storage Node

Two N_Ports and One Link

Page 38: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel - Arbitrated Loop

Up to 126 Devices in a Loop via NL_Ports

Token-access, Polled Environment (like FDDI)

Wait For Access Increases with Device Count

Page 39: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel - Fabric

Arbitrary Topology

Requires At Least One Switch

Up to 15 million ports can be concurrently logged in with the 24-bit address ID.

Dedicated Circuits between Servers & Storage via Switches

Interoperability Issues Increase With Scale

Page 40: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel – Device Count

126 devices in Arbitrated Loop

15 Million in a Fabric (24-bit addresses)
Bits 0-7: Port or Arbitrated Loop address
Bits 8-15: Area, identifies the FL_Port
Bits 16-23: Domain, the address of the switch

239 of the 256 domain addresses are available

256 x 256 x 239 = 15,663,104 (see the worked example below)
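
A worked Python example (my illustration) of the 24-bit address layout above; the sample address value is made up.

    def split_fabric_address(addr):
        # Domain (bits 16-23) = switch, Area (bits 8-15) = FL_Port, low byte = port
        return (addr >> 16) & 0xFF, (addr >> 8) & 0xFF, addr & 0xFF

    print(split_fabric_address(0x021A05))   # made-up address -> (2, 26, 5)

    # 239 usable domains x 256 areas x 256 ports gives the "15 million" figure
    print(239 * 256 * 256)                  # 15663104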

Page 41: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel Definitions

WWPN

Zone & Zoning

LUN

LUN Masking

Page 42: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel - WWPN

World Wide Port Name (WWPN)

A unique 64-bit hardware address for each FibreChannel Device

Analogous to a 48-bit Ethernet hardware address

WWNN – World Wide Node Name

Page 43: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel – Zone & Zoning

Switch-Based Access Control

Analogous to an Ethernet Broadcast Domain

Soft Zone
Zoning based on the WWPN of the Connected Nodes (Preferred)

Hard Zone
Zoning based on the Switch Port Number to which the Nodes are Connected

(A small sketch of both follows below.)
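
A minimal Python sketch (my illustration; not any switch vendor's interface) of the difference: a soft zone groups WWPNs, while a hard zone groups physical switch ports. The zone names, WWPNs, and port numbers are made-up examples.

    soft_zones = {"zone_db_hds": {"10:00:00:00:c9:2e:11:22", "50:06:0e:80:00:aa:bb:01"}}
    hard_zones = {"zone_db_hds": {("switch1", 3), ("switch1", 14)}}

    def can_talk_soft(wwpn_a, wwpn_b):
        # WWPN-based zoning: two nodes may talk if they share at least one zone
        return any({wwpn_a, wwpn_b} <= members for members in soft_zones.values())

    def can_talk_hard(port_a, port_b):
        # Port-based zoning: two switch ports may talk if they share at least one zone
        return any({port_a, port_b} <= members for members in hard_zones.values())

    print(can_talk_soft("10:00:00:00:c9:2e:11:22", "50:06:0e:80:00:aa:bb:01"))  # True
    print(can_talk_hard(("switch1", 3), ("switch2", 7)))                        # False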

Page 44: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel - LUN

Logical Unit

Storage Node Allocates Storage and Assigns a LUN

Appears to the server as a unique device (disk)

Page 45: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

FibreChannel – LUN Masking

Storage Node Based Access Control List (ACL)

The ACL controls which LUNs and which Server Connections (WWPNs) are allowed to see each other.

LUNs are Masked from Servers not in the ACL
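
A minimal Python sketch (my illustration; not a storage-array interface) of LUN masking as an ACL on the storage node: each server WWPN sees only its own LUNs, and everything else is masked. The WWPNs and LUN numbers are made-up values.

    lun_acl = {
        "10:00:00:00:c9:2e:11:22": {0, 1, 2},   # database server
        "10:00:00:00:c9:2e:33:44": {3},         # web server
    }
    all_luns = {0, 1, 2, 3, 4, 5}

    def visible_luns(server_wwpn):
        # Return only the LUNs this WWPN may see; mask the rest
        return sorted(lun_acl.get(server_wwpn, set()) & all_luns)

    print(visible_luns("10:00:00:00:c9:2e:11:22"))  # [0, 1, 2]
    print(visible_luns("10:00:00:00:c9:2e:99:99"))  # [] - unknown WWPN sees nothing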

Page 46: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

LUN Security

Host Software

HBA-based firmware or driver configuration

Zoning

LUN Masking

Page 47: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

LUN Security

Host-based & HBA
Both of these methods rely on security correctly implemented at the edges
Most difficult to manage due to the large number and variety of servers
Storage Managers may not be Server Managers
Don’t trust the consumer to manage the resources: trusting the fox to guard the hen house

Page 48: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

LUN Security

Zoning
An access control list that establishes a conduit; a circuit will be constructed through it
Allows only selected Servers to see a Storage Node
Lessons learned:
Implement in parallel with LUN Masking
Segregate OS types into different Zones
Always Promptly Remove Entries for Retired Servers

Page 49: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

LUN Security

LUN Masking
The Storage Node’s Access Control List
Sees the Server’s WWPN and masks all LUNs not allocated to that server
Allows the Server to see only its assigned LUNs
Implement in parallel with Fabric Zoning

Page 50: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

LUN - Persistent Binding

Persistent Binding of LUNs to Server Device IDs

Permanently assign a System SCSI ID to a LUN.

Ensures the Device ID Remains Consistent Across Reconfiguration Reboots

Different HBAs use different binding methods & syntax (a generic sketch of the idea follows below)

Tape Drive Device Changes have been a repeated source of NetBackup Media Server Failure
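
A generic Python sketch (my illustration; every HBA driver has its own real binding mechanism and syntax) of the persistent-binding idea: remember which SCSI target ID was handed to each (target WWPN, LUN) pair so the device name stays stable across reconfiguration reboots. The file name and WWPN are hypothetical.

    import json

    BINDINGS_FILE = "hba_bindings.json"   # hypothetical persistent store

    def load_bindings():
        try:
            with open(BINDINGS_FILE) as f:
                return json.load(f)       # {"wwpn:lun": scsi_target_id}
        except FileNotFoundError:
            return {}

    def bind(wwpn, lun, bindings):
        # Reuse the stored target ID if present, otherwise allocate the next one
        key = f"{wwpn}:{lun}"
        if key not in bindings:
            bindings[key] = max(bindings.values(), default=-1) + 1
        return bindings[key]

    bindings = load_bindings()
    print("stable SCSI target ID:", bind("50:06:0e:80:00:aa:bb:01", 2, bindings))
    with open(BINDINGS_FILE, "w") as f:
        json.dump(bindings, f)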

Page 51: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN Performance

Storage Configuration

Fabric Configuration

Server Configuration

Page 52: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN - Storage Configuration

More Spindles are Better

Faster Disks are Better

RAID 1+0 vs. RAID 5
“RAID 5 performs poorly compared to RAID 0+1 when both are implemented with software RAID” – Allan Packer, Sun Microsystems, 2002

Where does RAID 5 underperform RAID 1+0? Random Writes (see the sketch below)

Limit the Number of Partitions Within RAIDsets
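
A back-of-the-envelope Python sketch (my illustration) of the random-write gap: a small RAID 5 write costs four back-end I/Os (read old data, read old parity, write new data, write new parity), while RAID 1+0 costs two (write both mirror copies). The spindle count and per-disk IOPS are assumed values.

    disks = 8
    iops_per_disk = 150                 # assumed small-random-I/O rate per spindle
    backend_iops = disks * iops_per_disk

    raid10_writes = backend_iops / 2    # 2 back-end I/Os per host write
    raid5_writes = backend_iops / 4     # 4 back-end I/Os per host write

    print(f"RAID 1+0 random-write ceiling: {raid10_writes:.0f} IOPS")
    print(f"RAID 5   random-write ceiling: {raid5_writes:.0f} IOPS")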

Page 53: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN - Fabric Configuration

Use a Common Switch for Server & Storage
Multiple “hops” reduce performance
Increases Reliability

Large Port-Count Switches: 32 ports or more
16-port switches create larger fabrics simply to carry their own overhead

Page 54: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN - Server Configuration

Choose the Highest-Performance HBA Available
PCI: 64-bit is better than 32-bit
PCI: 66 MHz is better than 33 MHz

Place It in the Highest-Performance Slot
Choose the widest, fastest slot in the system
Choose an underutilized controller

Size LUNs by RAIDset Disk Size
BAD: LUN sizes smaller than the underlying disk size

Page 55: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN Resilience

At Least Two Fabrics

Dual-Path Server Connections
Each Server N_Port is Connected to a Different Fabric
Circuit Failover upon Switch Failure

Automatic Traffic Rerouting

Hot-Pluggable Disks & Power Supplies

Page 56: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN Resilience – Dual Path

Multiple FibreChannel Ports within Server

Active/Passive Links

Most GPRD SAN disruptions have affected single-attached servers
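
A minimal Python sketch (my illustration; not a real multipathing driver) of the active/passive idea above: I/O uses the active HBA port, and a link failure on that fabric shifts traffic to the passive port on the other fabric. Names like hba0 and fabricA are made up.

    class DualPath:
        def __init__(self, active, passive):
            self.paths = [active, passive]   # e.g. ("hba0", "fabricA"), ("hba1", "fabricB")
            self.current = 0                 # index of the currently active path

        def submit_io(self, io, link_up):
            # link_up: callable reporting whether a path's link is healthy
            if not link_up(self.paths[self.current]):
                self.current = 1 - self.current   # fail over to the other fabric
            return f"sent {io} via {self.paths[self.current]}"

    mp = DualPath(("hba0", "fabricA"), ("hba1", "fabricB"))
    print(mp.submit_io("write block 42", link_up=lambda p: p[1] != "fabricA"))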

Page 57: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN – Good Housekeeping

Stay Current With OS Drivers & HBA Firmware

Before You Buy a Server’s HBA: Is it supported by the switch & storage vendors?

Coordinate Firmware Upgrades with the Storage Team & Other Server Admin Teams Using the SAN

Monitor Disk I/O Statistics: Be Proactive; Identify and Eliminate I/O Problems

Page 58: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

SAN Backups – Why We Should

Why We Should:
Offload the Front-End IP Network
Most Servers are still connected to 100BaseT IP
1 or 2 Gbps FC Links Increase Throughput and Shrink Backup Times (see the arithmetic below)

Why We Don’t: Cost

NetBackup Media Server License: starts at $5K list
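
Quick arithmetic in Python (my illustration) behind the throughput point above: moving the same backup over 100BaseT versus 1 or 2 Gbps Fibre Channel. The 500 GB backup size and the 70% usable-throughput factor are assumptions.

    backup_gb = 500
    usable = 0.7    # assume ~70% of line rate is achievable end to end

    for name, gbps in [("100BaseT", 0.1), ("1 Gbps FC", 1.0), ("2 Gbps FC", 2.0)]:
        bytes_per_sec = gbps * 1e9 / 8 * usable
        hours = backup_gb * 1e9 / bytes_per_sec / 3600
        print(f"{name:>9}: {hours:.1f} hours for {backup_gb} GB")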

Page 59: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Backup Futures

Incremental Backups
No longer stored on tape
Use “near-line” cheap disk arrays
Several vendors are currently under evaluation

Still over IP
1 Gbps Ethernet is commonly available on new servers
10 Gbps Ethernet is needed in the core

Page 60: Storage Area Network Usage A UNIX SysAdmins View of How A SAN Works.

Questions