© 2013 IBM Corporation
IBM SmartCloud Virtual Storage Center
SVC Product Update, Review, and Replication Discussion for BCRS
Gary Landon, Storage Virtualization Architect – North America
SAN Volume Controller (SVC)
Level set of history
Technical Overview
New Hardware
Major Software functions
With a Deeper Dive on Replication functions
Competitive Discussion
Questions
Agenda/Objective
Industry-leading storage virtualization offering
12 Years, Over 80 percent Market Share (External Virtualization)
Only storage virtualization system with integrated Real-time Compression designed to be used with active primary data
Best performing storage virtualization system in industry-standard benchmarks
First storage virtualization system with fully integrated SSD support
Integrated iSCSI server attachment support and support for the FCoE protocol
Fully upgradable without disruption from smallest to largest configurations
“Future proof” with ability to replace current hardware with new hardware without disruption
Network-based virtualization with VSC supports diverse server environments, including VMware, other virtualization, and non-virtualized servers
IBM has shipped over 40,000 SVC engines running in more than 10,000 SVC systems
From 2006 to present, across this entire installed base, SVC delivered better than five nines (99.999%) availability
SVC can virtualize IBM and non-IBM storage (over 170 systems from IBM, EMC, HP, HDS, Sun, Dell, NetApp, Fujitsu, NEC, Bull)
SVC – Twelve Years in the Market – ‘Gold’ Standard for Storage Virtualization
Storage Controllers – Traditional Architecture
Advanced Functions
[Diagram: traditional storage controller – Disks, RAID, and Device Intelligence layers]
Advanced Functions include:
– Cache
– Point-in-Time Copy (in the box)
– Remote Mirror (like to like)
– Multi-Path Drivers
– LUN Masking
– Dynamic Volume Expansion
– Connectivity to a variety of platforms
– etc.
Think of SVC as taking intelligence out of the Storage Controller, placing it in a standalone appliance, and integrating it with a powerful Logical Volume Manager (LVM) (which typically resides on the Server).
This enables SVC to provide a common set of advanced functions for multiple (possibly heterogeneous) Storage Controllers!
SVC
Virtualization Layer (SVC)
– Storage Pooling
– Thin Provisioning
– Sync/Async Replication (MM/GM)
– Real-Time Compression
– Point-in-Time Copy (FlashCopy)
– Auto Tiering
– Logical Volume Mirroring
– IO Caching (Read/Write)
– Data Mobility
– Flash Cache (future)
– Active-Active (Stretch Cluster)
– Encryption (future)
– VAAI (VMWare APIs)
– Thick-Thin-RTC Conversion
– Integration with VMControl
– QOS
– Vvol/VASA
– App Consistent Snapshots
DISK ARRAY
– IO Caching
– RAID and Pooling
– Auto Tiering (allowed)
– Disk Encryption
Above Virtualization Layer – Control Plane (TPC)
– Performance Monitoring/Alerting
– Integrated Reporting Engine
– End-to-End Topology (Agentless)
– Cloud API (OpenStack, VMWare, etc.)
– Automated Storage/SAN Provisioning
– Service Classes/Catalog
– Storage Pools
– Capacity Pools (pick best pool)
– Approval Tickets for Provisioning
– Right-Tiering Analytics
– Storage Pool Balancing Analytics
– Storage Transformation Analytics
– Configuration Change Analytics
– SAN Planning Analytics
– Replication Management
– File-Level Analytics/ILM
This proprietary educational material is intended for IBM and IBM Business Partner staff only. It is not intended for distribution to customers or other third parties.
SVC 2145-DH8 Storage Engine
Based on IBM System x3650 M4 (2U) server (19” rack-mount enclosure)
– Intel® Xeon® Ivy Bridge eight-core processor with 32GB cache
– Two internal batteries (BBU)
– 3x 1Gb Ethernet ports for iSCSI and management
– 1x 1Gb technician Ethernet port
– 2x 600GB/10K SAS internal drives for boot and hardened data dump
Flexible hardware options
– Up to three IO adapters – 8Gbps FC (max 3), 16Gbps FC (max 4), and/or 10Gb Ethernet (iSCSI/FCoE) connectivity
– Second eight-core processor with 32GB RAM (initially for use with the Compression Accelerator and/or a third IO card)
– Compression Accelerator (up to two cards; requires the second processor feature)
– Expansion enclosure attachment card – 12Gb SAS
– Expansion Enclosure (Model EE1) – maximum of two expansion enclosures
• 12Gb SAS with twenty-four 2.5-inch (SFF) drive slots
• Supports 12Gb SAS flash drives – 800GB, 400GB, 200GB
New engines may be intermixed in pairs with other engines in SVC clusters
– Mixing engine types in a cluster results in VDisk throughput characteristics of the engine type in that I/O group
Cluster non-disruptive upgrade capability may be used to replace older engines with new engines
Supported by SVC software Version 7.3 or later
[Photo callouts: 2U enclosure, BBU1/BBU2, boot drives]
SVC Logical View (diagram)
– External LUNs (e.g., four 1000GB EMC LUNs and three 2000GB IBM LUNs) are presented to SVC as Managed Disks (MDisks): mdisk0–mdisk3 (1000GB each) and mdisk4–mdisk6 (2000GB each)
– MDisks are grouped into Storage Pools (Managed Disk Groups): mdiskgrp0 [EMC Group, 4000GB] and mdiskgrp1 [IBM Group, 6000GB]
– Each pool is divided into extents (16MB – 8GB) that are striped across the MDisks in the pool (stripe: 2 – 512 KB)
– Volumes (VDisks) are carved from the pools: vdisk0 125GB, vdisk1 10GB (mirrored, thin-provisioned, compressed), vdisk2 2TB, vdisk3 1TB, vdisk4 275GB, vdisk5 2TB
– Volumes are mapped to hosts with SDD or another supported multipath driver
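To make the extent mapping above concrete, here is a minimal sketch (illustrative only, not SVC code) of how MDisks are split into extents, pooled, and then drawn from round-robin by a striped VDisk. The 1GB extent size and all names are assumptions for the example.

# Illustrative extent mapping: MDisks -> storage pool -> striped VDisk.
# Sizes in GB; the extent size is an assumption (SVC allows 16MB - 8GB).
EXTENT_GB = 1

def build_pool(mdisk_sizes_gb):
    """Return, per MDisk, the list of (mdisk_index, extent_index) extents it contributes."""
    return [[(m, e) for e in range(size // EXTENT_GB)]
            for m, size in enumerate(mdisk_sizes_gb)]

def allocate_striped_vdisk(pool, vdisk_gb):
    """Draw extents round-robin across MDisks, the way a striped VDisk spreads its IO."""
    needed = vdisk_gb // EXTENT_GB
    mapping, rr = [], 0
    while len(mapping) < needed:
        candidates = [m for m in pool if m]          # MDisks that still have free extents
        if not candidates:
            raise RuntimeError("pool out of capacity")
        mapping.append(candidates[rr % len(candidates)].pop(0))
        rr += 1
    return mapping

# Example: the 'EMC Group' pool from the diagram (four 1000GB MDisks)
pool = build_pool([1000, 1000, 1000, 1000])
vdisk0 = allocate_striped_vdisk(pool, 125)
print(len(vdisk0), "extents, first few:", vdisk0[:4])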
Virtualization
ALL physical storage is amalgamated into a single or multiple storage pools
– External storage controllers
– SSDs within SVC nodes
Storage pooling allows for the full use of all available physical storage
Volumes are allocated as needed
– Fully allocated, over allocated (thin provisioning), or image mode (for migration into SVC)
Delay purchasing more storage until absolutely necessary
Data can be non-disruptively migrated between controllers
– Move data between different tiers
– Migrate in data from existing storage controllers
– Add new storage controllers to the pool
– Decommission end-of-life storage controllers
[Diagram: volumes allocated from a storage pool built on managed disks]
Tight VMware Integration
VAAI, Vvol, and VASA
Allows VMware to control common storage operations
Offloads I/O from virtual servers, improving application performance and dramatically improving some data-intensive functions (e.g., moving VMs)
– Optimized VM provisioning (Write Same): uses the storage system to zero out new volumes
– Block locking (Atomic Test & Set): improved VMFS performance with fewer reservation conflicts
– Fast copy (optimized cloning, XCOPY): storage-side volume-to-volume cloning
SRM
VMware SRM coordinates the failover of guests between primary and DR sites
SVC SRA plugin allows SRM to coordinate the guest failover with the underlying storage and any Global or Metro Mirror relationships between the two sites.
vCenter Plugin
Storage admins can delegate control of selected pools to VMware admins, giving them control over the storage from within the VMware environment.
Capacity is committed on use, helping improve overall storage utilization
Volume provisioning and resizing
System and volume information
Receive events/alerts for systems attached to vSphere
Thin Provisioning (or Over Allocation, Space Efficient)
Fine grain thin provisioning
Volumes are created with a virtual size (the volume size seen by the host) and a physical size (the amount of physical storage allocated from the storage pool)
Physical storage is used as the host writes non-zero data
Additional physical storage can be added if and when needed
– Per-volume thresholds can be set to warn the user when additional capacity is needed
Reduces potential for unused, allocated storage
– Provide what a consumer thinks they need
– Allocate what a consumer is actually using
[Diagram: virtual capacity vs. physical capacity vs. used capacity]
Expect to see SVC aggressively deliver support for File System Deallocation APIs once delivered by respective vendors
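A toy illustration of the thin-provisioning behaviour described above (not the SVC implementation): physical grains are consumed from the pool only when the host writes non-zero data, and a per-volume threshold warns when more capacity is needed. The grain size and all names here are assumptions.

# Toy thin-provisioned volume: allocate physical grains only on the first non-zero write.
GRAIN_KB = 256                       # assumed grain size

class ThinVolume:
    def __init__(self, virtual_gb, warn_percent=80):
        self.virtual_grains = virtual_gb * 1024 * 1024 // GRAIN_KB
        self.allocated = {}          # virtual grain -> physical grain
        self.warn_percent = warn_percent

    def write(self, grain, data):
        if data == b"\x00" * len(data):
            return                   # writing zeros does not consume physical space
        if grain not in self.allocated:
            self.allocated[grain] = len(self.allocated)   # next free physical grain
            used = 100 * len(self.allocated) / self.virtual_grains
            if used >= self.warn_percent:
                print(f"warning: {used:.0f}% of virtual capacity is physically allocated")

vol = ThinVolume(virtual_gb=10)      # host sees 10GB
vol.write(0, b"\x00" * 512)          # zeros: no allocation
vol.write(1, b"hello")               # first non-zero write: one grain allocated
print("grains allocated:", len(vol.allocated))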
Volume Mirroring
Mirroring data between two independent volumes
– A single volume is mapped to the host
– Data is written to both volumes
– If one volume becomes unavailable, the system makes progress with the remaining volume
– When the mirror is reintroduced, the data is resynchronized
Each volume can be fully allocated, over allocated, image mode or compressed
– Enables volume attributes to be changed non-disruptively
Underlying physical disks can be located on different storage systems
– Protects against a storage controller failure
[Diagram: the two copies of a mirrored volume can be placed in different storage pools on different storage systems]
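A simplified sketch of the volume-mirroring behaviour on this slide (illustrative, not SVC internals): one volume is presented to the host, every write fans out to both copies, a failed copy is skipped, and blocks written while a copy was offline are tracked so that copy can be resynchronized when it returns.

# One host-visible volume backed by two copies; writes go to both.
class MirroredVolume:
    def __init__(self):
        self.copies = [{}, {}]       # copy index -> {block: data}
        self.online = [True, True]
        self.stale = set()           # blocks written while a copy was offline

    def write(self, block, data):
        for i, copy in enumerate(self.copies):
            if self.online[i]:
                copy[block] = data
            else:
                self.stale.add(block)     # remember what to resync later

    def fail_copy(self, i):
        self.online[i] = False

    def resync_copy(self, i):
        src = self.copies[1 - i]
        for block in self.stale:
            self.copies[i][block] = src[block]
        self.online[i] = True
        self.stale.clear()

vol = MirroredVolume()
vol.write(0, "A")
vol.fail_copy(1)                     # one copy becomes unavailable
vol.write(1, "B")                    # system makes progress with the remaining copy
vol.resync_copy(1)                   # reintroduced copy is brought back in sync
print(vol.copies[0] == vol.copies[1])    # True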
Easy Tier – Automated Data Relocation
[Diagram: host volume built on a hybrid storage pool of Flash and HDD arrays; with extent virtualization, hot extents migrate up to Flash and cold extents migrate down to HDD]
• Easy Tier recognises that a small portion of data is used the majority of the time and maximises Flash usage to increase performance
• Easy Tier is enabled simply by creating a hybrid storage pool of Flash and HDDs
• Traditionally, different drive technologies are put in different storage pools
• Flash can be within SVC nodes or installed in an external storage array
• A heat map is constructed to identify how the data is being used
• Extents are then migrated between Flash and HDDs AUTOMATICALLY
• ‘Hot’ data is moved to Flash
• ‘Cold’ data is moved to HDDs
NOTE: 2-tier example shown; 3 tiers are supported
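A much-simplified sketch of the heat-map idea described above (illustrative only; real Easy Tier works from longer-term IO statistics): count IO per extent, then move the hottest extents to Flash and demote anything that no longer fits, within the Flash capacity available. Names and numbers are assumptions.

# Toy 2-tier Easy Tier pass: hottest extents go to Flash, colder extents return to HDD.
from collections import Counter

def rebalance(io_counts, placement, flash_capacity):
    """io_counts: Counter of IOs per extent; placement: extent -> 'flash' or 'hdd'."""
    ranked = [e for e, _ in io_counts.most_common()]          # hottest first
    hot = set(ranked[:flash_capacity])
    plan = []
    for extent in ranked:
        tier = "flash" if extent in hot else "hdd"
        if placement.get(extent, "hdd") != tier:
            plan.append((extent, placement.get(extent, "hdd"), tier))
            placement[extent] = tier
    return plan                                               # list of (extent, from, to)

io = Counter({"e0": 900, "e1": 40, "e2": 700, "e3": 5})
placement = {"e0": "hdd", "e1": "flash", "e2": "hdd", "e3": "hdd"}
print(rebalance(io, placement, flash_capacity=2))
# e0 and e2 migrate up to Flash; e1 migrates down to HDD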
FlashCopy
Rich, highly configurable FlashCopy environment
– Point-in-time copy
– Works with thin provisioning
– Create and trigger a copy within seconds
– Multiple copies of a volume
– Copies of copies (of copies…)
– Consistency groups can be used to trigger multiple copies simultaneously
– Each relationship is individually managed and reversible
Copy-on-write technology
– Allows separate fault domains
– Snapshots can be on a different storage tier
Can be used with Tivoli FlashCopy Manager to trigger an application-consistent copy
– Application-aware snapshots
– Tight back-end integration with TSM
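A minimal copy-on-write sketch to make the FlashCopy behaviour concrete (illustrative only): once a point-in-time copy is triggered, a source grain is preserved on the target just before it is overwritten, so the target always reads as the volume looked at trigger time. Class and method names are assumptions.

# Copy-on-write point-in-time copy: the target only holds grains that changed on the source.
class FlashCopyRelationship:
    def __init__(self, source):
        self.source = source             # dict: grain -> data
        self.target = {}                 # grains preserved at their point-in-time value

    def write_source(self, grain, data):
        if grain in self.source and grain not in self.target:
            self.target[grain] = self.source[grain]   # preserve old data before overwrite
        self.source[grain] = data

    def read_target(self, grain):
        # Unchanged grains are read through from the source (NOCOPY-style behaviour)
        return self.target.get(grain, self.source.get(grain))

src = {0: "jan", 1: "feb"}
fc = FlashCopyRelationship(src)          # trigger: effective within seconds, no bulk copy
fc.write_source(0, "mar")                # source keeps changing...
print(fc.read_target(0), fc.read_target(1))   # 'jan feb' -- the point-in-time image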
[Diagram: restore scenario – 1. preserve the damaged data, 2. the backup is restored to the source while the backup to tape continues]
FlashCopy Types
Snapshot
– Cascade of backups on thin-provisioned volumes
– Each target has a dependency on the source volume
– Subsequent triggers update just the changes
– Ability to reverse changes to restore data
Backup
– Full backup, retaining the relationship to the source
– Subsequent backups copy only the changes
Clone
– Full backup of the source volume
– Deletes the dependency on the source volume when the copy is complete, so the volume can be used independently
Metro Mirror (Synchronous Remote Copy)
Synchronous, continuous copy of data between two volumes in different clusters
No loss of data if primary cluster lost
– Write is completed to the host once it has been committed at both sites
Suitable for up to 300km
– Application latency tolerance may limit practical distance to less
Additional 1ms latency on writes per 100km between sites
Requires suitable link bandwidth and DR storage capacity to avoid host latency at primary site
[Diagram: Primary Cluster replicating synchronously to the Secondary Cluster at the DR site]
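A back-of-the-envelope helper for the “additional 1ms per 100km” rule of thumb on this slide; the application latency budget passed in is an assumption the application owner would have to supply.

# Rule of thumb from the slide: synchronous replication adds ~1ms of write latency per 100km.
def added_write_latency_ms(distance_km):
    return distance_km / 100.0

def max_practical_distance_km(app_latency_budget_ms):
    """Largest inter-site distance whose added latency still fits the application's budget."""
    return app_latency_budget_ms * 100.0

print(added_write_latency_ms(300))          # 3.0 ms extra per write at the 300km maximum
print(max_practical_distance_km(1.5))       # 150 km if the application can only absorb 1.5 ms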
Global Mirror (Asynchronous Remote Copy)
Asynchronous, continuous copy of data between two volumes in different clusters
Some loss of data if primary cluster lost
– Normally less than 200ms
– Data loss is consistent across multiple relationships
Supported for up to 250ms latency (7.4)
– 80ms latency at 7.3 and earlier code levels
Approximately a 1ms delay on writes regardless of the distance between sites
Requires suitable link bandwidth and DR storage capacity to avoid relationship becoming out of sync
[Diagram: Primary Cluster replicating asynchronously to the Secondary Cluster at the DR site]
Original Global Mirror – Synching Up and Running
Primary Server Recovery Server
Primary Volume → Secondary Volume (exact same size)
o Grains are copied from Primary to Secondary
• A write to an already-copied grain must be replicated on the Secondary
o The Secondary is inconsistent (corrupt) until the first synch is done
• If the Relationship is stopped during the initial synch, it can only be restarted to finish the synch
o Once the copy is done, the Secondary is “Consistent Synchronized”
• All subsequent writes to the Primary are replicated
• The Secondary is still unavailable to a host until the Relationship is stopped
• The Secondary is Consistent unless the Relationship is stopped and restarted
– Inconsistent during re-synch, just like the initial synch
[Diagram: two clusters linked by SAN connection(s)]
Original Global Mirror – Mirror stopped after initial synch
Primary Server Recovery Server
Primary Volume → Secondary Volume (exact same size)
o Can be a deliberate stop or due to an outage
• A deliberate stop drives in-flight writes before stopping
o Can make the Secondary accessible to a host (-access)
• Bitmaps on both sides keep track of changes
o Can be restarted in either direction
• Bitmaps are OR’d together and all flagged grains are sent from the new Primary to the new Secondary
[Diagram: two clusters linked by SAN connection(s)]
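A small sketch of the bitmap mechanism described on the last two slides (illustrative, not SVC code): each side records which grains changed while the relationship was stopped; on restart the two bitmaps are OR’d together and only those grains are re-sent from the new primary.

# Bitmap-driven resynchronization after a stopped Metro/Global Mirror relationship.
def resync(primary, secondary, changed_on_primary, changed_on_secondary):
    """changed_* are sets of grain numbers written while the relationship was stopped."""
    to_send = changed_on_primary | changed_on_secondary      # bitmaps OR'd together
    for grain in to_send:
        secondary[grain] = primary[grain]                    # re-send only flagged grains
    return sorted(to_send)

primary   = {0: "p0", 1: "p1", 2: "p2v2", 3: "p3"}
secondary = {0: "p0", 1: "p1", 2: "p2",   3: "x"}            # host wrote grain 3 during -access
print(resync(primary, secondary, {2}, {3}))                  # only grains 2 and 3 cross the link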
Multi-cycling Global Mirror (Asynchronous Remote Copy)
Asynchronous, continuous copy of data between two volumes in different clusters
Trades a higher RPO for lower bandwidth requirements
Loss of data if primary cluster lost
– User specifies a minimum cycle time and ensures suitable bandwidth to meet RPO objectives
– A FlashCopy of the source is taken at the primary site
– A FlashCopy of the target is taken at the secondary site
– GM is used to copy between the two sites
– Both sites are ensured a consistent copy of the data
No impact to the host
Resilient to bursts of IO and limited bandwidth
[Diagram: Primary Cluster replicating asynchronously to the Secondary Cluster at the DR site]
Global Mirror with Change Volumes -1
Production Server Recovery Server
Primary Volume → Secondary Volume
o Change Volumes are provided by the user
• They cannot be used in any other way while in use by the Relationship
• They are not exactly FlashCopies, but very similar to FlashCopy NOCOPY
o “Snapshots” of the Primary are sent (synched) to the Secondary
• After the synch, a “snapshot” of the Secondary is taken to preserve a Consistent copy
• A cycle ends at the end of the synch or the timer pop, whichever is later
• The freeze time shows the time of the Secondary
▫ RPO = current time minus freeze time; could be up to two cycles
• “Too many” writes = the Secondary falls further behind
o Can still FlashCopy the Primary or the Secondary Volumes
• A FlashCopy of the Secondary gives data from the last complete synch
[Diagram: two clusters linked by SAN connection(s), each with a Change Volume]
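A small helper for the RPO arithmetic on this slide (RPO = current time minus the Secondary’s freeze time); the slide notes the exposure can reach roughly two cycles, since a cycle ends at the end of the synch or the timer pop, whichever is later. The numbers below are assumptions for illustration.

# RPO bookkeeping for Global Mirror with Change Volumes.
def current_rpo_seconds(now, freeze_time):
    """Freeze time is when the data now on the Secondary was captured on the Primary."""
    return now - freeze_time

def worst_case_rpo_seconds(cycle_period, cycle_duration):
    # Just before a cycle completes, the Secondary's data is roughly two cycles old;
    # if synchs overrun the timer, the effective cycle is the synch duration.
    effective_cycle = max(cycle_period, cycle_duration)
    return 2 * effective_cycle

print(current_rpo_seconds(now=1000.0, freeze_time=400.0))                  # 600 s behind right now
print(worst_case_rpo_seconds(cycle_period=300, cycle_duration=450))        # ~900 s worst case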
Global Mirror with Change Volumes -2
Primary Server Recovery Server
Primary Volume → Secondary Volume
o The Secondary is still unusable until the first synch is done
• A stop in the middle of the first synch still allows a restart only in the same direction
o A stopped Relationship (after the initial synch) can allow host access to the Secondary
• Uses the last available “snapshot” – data from the last synch
• If stopped in the middle of a synch, it does not finish the synch
– No in-flight writes are driven; the Secondary is still behind the Primary
– This affects attempts to make application-consistent copies (more on this later)
o Can be restarted in either direction
o Only 2048 total Relationships; Volumes no larger than 2TB
[Diagram: two clusters linked by SAN connection(s), each with a Change Volume]
Native IP Remote Copy
Enables transparent use of 1Gbit or 10 Gbit Ethernet connections without FCIP routers for replication
– Supports all remote copy modes – MM and GM
• GM with Change Volumes preferred mode
– Covered by normal remote copy license
• Not a new replication offering, but rather a new transport alternative to a Fibre Channel network
Configuration:
– Automatic path configuration via discovery of remote cluster
– Configure one Ethernet port for replication on each node using remote copy port groups
– CHAP-based authentication supported
– RTT limit applies (80ms pre-7.4; 250ms with 7.4 and later)
– Includes Bridgeworks SANSlide network optimization technology in V7.2 virtualization software
[Diagram: Primary Volume replicating to Secondary Volume over an Ethernet TCP/IP network]
Bridgeworks SANSlide Optimization
With TCP/IP, information transfer slows the further you go
– This is because of the latency caused by waiting for acknowledgement of each set of packets sent; the next packet set cannot be sent until the previous one has been acknowledged
Enhanced parallelism by using multiple virtual connections (VCs) that share the same IP links and addresses:
– While waiting for one VC’s ACK, it sends more packets across other VCs
– If packets are lost from any VC, the data is retransmitted
– An Artificial Intelligence engine adjusts the number of VCs, receive window size, and packet size as appropriate to maintain optimum performance
(Figure from REDP5023)
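A rough sketch of why multiple virtual connections help (purely illustrative arithmetic, not the SANSlide implementation): with a single connection, each window of packets waits a full round trip for its ACK, so effective throughput is capped by window/RTT; spreading traffic across several VCs lets other VCs send while one is waiting. Window size, RTT, and link speed below are assumed values.

# Effective throughput of ACK-gated transfer over a long-RTT link, with N virtual connections.
def effective_throughput_mbps(window_kb, rtt_ms, num_vcs, link_mbps):
    # Each VC can put at most one window on the wire per round trip.
    per_vc_mbps = (window_kb * 8) / rtt_ms          # kbit per ms == Mbit/s
    return min(link_mbps, num_vcs * per_vc_mbps)

# Assumed numbers: 64KB window, 80ms RTT, 1Gbps link
print(effective_throughput_mbps(64, 80, num_vcs=1,  link_mbps=1000))   # ~6.4 Mbit/s
print(effective_throughput_mbps(64, 80, num_vcs=50, link_mbps=1000))   # ~320 Mbit/s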
2-site Stretched Cluster
Stretched Cluster
Improve availability, load-balance, and deliver real-time remote data access by distributing applications and their data across multiple sites.
Seamless server / storage failover when used in conjunction with server or hypervisor clustering (such as VMware or PowerVM)
Up to 300km between sites (3x EMC VPLEX)
7.2 added ‘Site Awareness’ to keep IO local whenever possible
Metro or Global Mirror
4-site Disaster Recovery
For combined high availability and disaster recovery needs, synchronously or asynchronously mirror data over long distances between two high-availability stretch clusters.
[Diagram: two SVC stretched clusters – Data center 1 and Data center 2, each with Server Cluster 1 and Server Cluster 2 and a stretched virtual volume with local failover for High Availability – linked by Metro or Global Mirror for Disaster Recovery; up to 300km between sites]
Embedded RACE Compression
What this is
IBM recently acquired the company known as Storwize (that name has now been applied to the new IBM Storwize V7000), which produced appliances that sit in front of NAS (network-attached storage) arrays and compress data being written to the array, using Lempel-Ziv algorithms in its Random Access Compression Engine (RACE).
This RACE technology will be embedded into our code to provide the same compression capabilities for block storage
Why it matters
Growing customer data demands more disk space and drives up the cost of storage in the data center
The RACE engine, which can achieve upwards of 50 percent data reduction, allows customers to reduce the amount of disk required for their data
[Diagram: a 100GB volume as seen by the host system; 40GB of actual host data written to disk drives (without compression); 20GB of actual space used on disk drives after the RACE compression layer in the SVC code]
Real-time Compression uses the same proven Random-Access Compression Engine (RACE) as IBM RTC Appliances
Delivers similar levels of compression
IBM Comprestimator tool can be used to evaluate expected compression benefits for specific environments
IBM Real-Time Compression
Expected compression ratios:
– DB2 and Oracle databases: up to 80%
– Virtual servers (VMware), Linux virtual OSes: up to 70%
– Virtual servers (VMware), Windows virtual OSes: up to 50%
– Office 2003: up to 60%
– Office 2007 or later: up to 20%
– CAD/CAM: up to 70%
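A simple planning sketch using the “up to” ratios listed above; the real answer comes from running the IBM Comprestimator tool against actual volumes, and the workload mix below is an assumption for illustration.

# Rough compressed-capacity estimate from the published 'up to' ratios.
savings = {"database": 0.80, "linux_vm": 0.70, "windows_vm": 0.50, "office_2007": 0.20, "cad": 0.70}

def compressed_tb(workload_tb):
    """workload_tb: dict of workload type -> TB of data before compression."""
    return sum(tb * (1 - savings[w]) for w, tb in workload_tb.items())

mix = {"database": 40, "windows_vm": 30, "office_2007": 10}   # 80 TB before compression
print(compressed_tb(mix))                                     # 31.0 TB after compression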
EMC
VPlex and RecoverPoint (see next two slides)
Federated (Monolithic Array)
Try to convince customer there is no need to virtualize
HDS
Virtualization and Replication functions inside the array
Functions more closely mimic SVC than EMC’s; however, it lacks RTC and a migration strategy when it is time to refresh the array, and few customers virtualize behind it long term due to complexity/cost
Recently unveiled ‘diskless’ approach
Network Appliance
NetApp attempts to do ‘everything’ in the same box
‘Redirect on Write’ for Snapshots
Integrated sync/async replication, Dedup with performance ramifications
Has moved away from ‘gateway’ approach, limited traction for large block
SVC – Competitive Landscape
Virtualization Layer (Data Plane)
– Storage Pooling
– Thin Provisioning
– Sync/Async Replication (MM/GM)
– Real-Time Compression
– Point-in-Time Copy (FlashCopy)
– Auto Tiering
– Logical Volume Mirroring
– IO Caching (Read/Write)
– Data Mobility
– Flash Cache (future)
– Active-Active (Stretch Cluster)
– Encryption (future)
– VAAI (VMWare APIs)
– Thick-Thin-RTC Conversion
– Integration with VMControl
– QOS
– Vvol/VASA
– App Consistent Snapshots
DISK ARRAY
– IO Caching
– RAID and Pooling
– Auto Tiering (allowed)
– Disk Encryption
Above Virtualization Layer – Control Plane
– Performance Monitoring/Alerting
– Integrated Reporting Engine
– End-to-End Topology (Agentless)
– Cloud API (OpenStack, VMWare, etc.)
– Automated Storage/SAN Provisioning
– Service Classes/Catalog
– Storage Pools
– Capacity Pools (pick best pool)
– Approval Tickets for Provisioning
– Right-Tiering Analytics
– Storage Pool Balancing Analytics
– Storage Transformation Analytics
– Configuration Change Analytics
– SAN Planning Analytics
– Replication Management
– File-Level Analytics/ILM
VPlex (Virtualization Layer)
– Storage Pooling
– IO Caching (Read Only)
– Data Mobility (w/limitations)
– Active-Active (Stretch Cluster)
– VAAI (ATS and WriteSame only)
DISK ARRAY (services vary based on array)
– IO Caching
– Point-in-Time Copy (within array)
– RAID and Pooling (within array)
– Auto Tiering (within array)
– Sync/Async Replication (like to like only)
– Disk Encryption
– Thin Provisioning
– Compression (inactive data)
– VAAI (XCopy and Unmap)
– Flash Cache
Above Virtualization Layer (RecoverPoint)
– Sync/Async Replication (heterogeneous)
– Point-in-Time (CDP) Recovery
VIPR (Control Plane)
• Common user interface for configuring/provisioning multiple products (Phase 1)
• Does NOT change where functions reside (may change licensing structure)
• Does NOT eliminate the need for additional tools for performance monitoring/alerting
• Does NOT include analytics to drive efficiencies, best practices, and intelligent placement of data
How is VSC different than other Virtualization approaches?
•Customers started approx 66 percent Enterprise Disk and 34 percent Mid-Range
• EMC users ended at 56 percent Enterprise and 44 percent Mid-Range
• VSC users ended at 22 percent Enterprise and 78 percent Mid-Range
•VSC leveraged technologies (such as Pooling, Compression on Active Data, etc) and reduced growth rate by approximately 45 percent more than EMC approach
• A five-year look showed traditional 3-year analyses tend to understate the gains (gains get compounded!)
Specific to EMC, but the concepts are applicable to other vendors’ array-based approaches. Compared 5 multi-PB VSC users’ results to 15 EMC virtualization users (Federated Storage Tiering/VMAX and/or VPlex users): actual realized results over five years, with annual growth rates ranging between 15-45 percent; data provided by the end users.
•Average TCO 72 percent less with VSC
•Average five-year savings of more than $11 million per PB are realized in costs of storage capacity, personnel and facilities.
IBM has chosen to keep the Flash array ‘very’ fast and simple
Most other vendors – XtremIO, Pure, Violin, etc. – are placing software stacks inside the Flash device
This ‘slows’ it down from a latency perspective, and increases cost (but cost is approximately the same as Flash plus SVC)
IBM teams will almost always include SVC with Flash sales
Either as separate components (‘real SVC/VSC’ with Flash and other disk behind SVC)
Or as the V840 (Flash and SVC tightly bundled and direct-connected as a single unit)
There are architectural reasons for each – not discussed today
Point is – you will likely have SVC in ‘every’ customer that has IBM Flash
IBM Flash (840, V840, etc.)
Transform your environment
Storage virtualization
Automated storage provisioning
Self-service portal (optional)
Pay-per-use invoicing (optional)
Thin provisioning
Efficient remote mirroring
Snapshot management
Advanced GUI
IBM Real-time Compression (optional)
IBM Storage Analytics Engine
IBM Easy Tier
No “rip and replace” of existing storage
IBM SmartCloud Virtual Storage Center