Post on 07-Mar-2018
© 2013 IBM Corporation
Software-Defined Storage (SDS)
Christian Bolik (bolik@de.ibm.com), IBM Storage Software Development
Objectives
• Understand the driving forces behind the desire to move to an SDS
• Understand the purpose of SDS, and its relation to SDE and Cloud
• Gain insight into the 2 primary perspectives on SDS
• Learn about what is required for establishing an SDS
The increasing complexity, volume, and value of data
8 zettabytes of digital content created by 2015
[Chart: growth of digital content over 2000, 2005, 2010, and 2015, on a scale from gigabytes through terabytes, petabytes, and exabytes to zettabytes]
The information explosion meets budget reality
The Information Explosion…
• Information doubling every 18-24 months
• Storage growing 20-40% per year
• Storage budgets up 1%-5%
● Velocity of Change
● Acquisitions
● Mergers
● Consolidations
● ILM, Data Retention initiatives
● “Born on the Web” type applications
● Legal Requirements
● Regulations demanding data to be retained for many years
● Ever increasing variety of data stored digitally…
„Big Data“
Source: http://www.domo.com/blog/2012/06/how-much-data-is-created-every-minute/
Storage management pain points
• Top pain points are dominated by
– Growth management
– Cost
– Complexity
• Problems seem even more severe for midsize enterprises compared to large enterprises
The InfoPro Storage Study 1H12 – 451 Research
Managing increasing amounts of storage takes time… and money
The InfoPro Storage Study 1H12 – 451 Research
Survey respondents reported that 77% of storage staff time is devoted to administering ongoing operations, tasks that could be automated.
The special needs of Virtual Servers
2011 Storage Spend: Virtualized $20.082, Non-Virtualized $13.331 (IDC Storage Workloads, 10/2012)
• 60% of storage spend in 2011 was for attachment to virtual servers
• Nearly all customers reported some storage issue with VM usage.
• Virtual servers bring their own unique storage requirements, and need special consideration for:
  • New capacity
  • New operational processes – DR
  • New performance management
  • Planning considerations
From: Research Report: 2012 Storage Market Survey. Source: Enterprise Strategy Group Created for Connie Bright, IBM. © 2012 Enterprise Strategy Group, Inc. All Rights Reserved
Changing Workload Requirements
Systems of Record (Traditional Operations): Workload Optimized & Transaction Integrity
• Enabled for Cloud: Orchestration across compute/network/storage for provisioning, deployment, and management of workloads
• Automation of provisioning and configuration of storage based on application requirements, with ongoing adjustments based on policies/SLA
• Programmable adjustments to storage (via APIs) as application needs change
• Heterogeneous environment support
• Efficient management of data copies (backup/archive/compliance)
Systems of Engagement (Situational Need): Agility & Rapid Scale
• Born on Cloud: Orchestration across compute/network/storage for provisioning, deployment and management of workloads (DevOps)
• Dynamic scalability as applications and data requirements grow
• Cost-optimized storage via disks embedded in servers
• Multi-tenant security at a fine-grained, highly scaled level
• Open support of industry standards and APIs
Value is shifting to software to provide the dynamic and agile storage
environment required by these workloads
Introducing: Software-Defined Storage (SDS)
• IDC Definition of SDS:
A software-defined data center is „...a loosely coupled set of software
components that seek to virtualize and federate datacenter-wide
hardware resources such as storage, compute, and network
resources.... The goal for a software-defined datacenter is to....make
the datacenter available in the form of an integrated service....“
• Key attributes
– It is software
– Offers a full suite of storage services
– Federates physical storage capacity from multiple locations/technologies
Based on „IDC's Worldwide Software-Based (Software-Defined) Storage Taxonomy, 2013“
Flexibility, lower cost
Flexibility through virtualisation
Abstraction of storage capabilities
Today’s World, with No Software Defined Storage
1. A Workload Definition Layer (or application) defines storage capacity requirements
2. Storage administrators define logical volumes with the required storage capacity and make a best guess of performance requirements
3. Storage administrators map the logical volumes to the application
4. All the following events will need storage administrators’ intervention:
   a) Storage capacity needs to be increased or decreased
   b) Application performance degrades due to resource contention
   c) Performance requirements change (increase or decrease)
   d) Data protection needs change
   e) Replication policies change
   f) RPO and RTO of the data change
   g) Backup and archive policy changes
The New World with Software Defined Storage: “Programmable Storage”
1. A Workload Definition Layer (or application) will specify storage requirements explicitly:
   a) Performance
   b) Capacity
   c) RPO/RTO
   d) Replication, etc.
2. A Workload Orchestrator will schedule the workload with appropriate compute, network and storage resources to satisfy Service Level Objectives
3. If the performance of an application is impacted, the storage service will automatically detect it and adjust resources to maintain its Service Level Agreements
4. If the requirements change, applications will communicate with storage via APIs. The storage service will adjust the resources accordingly
Software Defined Storage = programmable smart storage
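The "programmable storage" flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the StorageRequirements fields, the StorageService class, and the workload name are hypothetical and do not correspond to any real product API.

```python
from dataclasses import dataclass


@dataclass
class StorageRequirements:
    """What a workload declares explicitly (hypothetical field names)."""
    capacity_gb: int
    min_iops: int
    rpo_minutes: int
    rto_minutes: int
    replicas: int


class StorageService:
    """Toy stand-in for an SDS control plane; not a real product API."""

    def __init__(self):
        self.requirements = {}
        self.provisioned_iops = {}

    def apply(self, workload, req):
        # Requirements arrive, or change later, through an API call.
        self.requirements[workload] = req
        self.provisioned_iops[workload] = req.min_iops

    def reconcile(self, workload, observed_iops):
        # Detect a shortfall against the objective and add resources.
        req = self.requirements[workload]
        if observed_iops >= req.min_iops:
            return False
        self.provisioned_iops[workload] += req.min_iops - observed_iops
        return True


service = StorageService()
service.apply("billing-db", StorageRequirements(500, 5000, 15, 60, 2))
adjusted = service.reconcile("billing-db", observed_iops=4200)
print(adjusted, service.provisioned_iops["billing-db"])  # True 5800
```

The point of the sketch is that no administrator appears in the loop: the workload states what it needs, and the service reacts to measurements.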
Key characteristics of an SDS-enabled Storage Service
• Commoditized persistent data storage (lower cost)
• Service-based infrastructure (easy to consume)
• Open standards and interfaces based platform (no vendor lock-in)
• Focus on solution rather than technical platform (application-oriented)
• Scalability (capacity, throughput, performance)
• Resilient (always available)
• Workload-aware („fit for purpose“, optimized)
• Covering block, file and object storage
• Cost-efficient and highly automated
SDS in SDE – Software Defined Environments
Tighter coordination between applications and storage/network:
– Exposing storage capabilities for the software to dynamically provision storage with the most suitable characteristics
– Introducing new operations between software and storage to let storage better adapt to the needs of software
– Integrating storage functions into the software to leverage higher-level knowledge
Control planes are separated from the hardware and moved to the software layer. Unified Control Planes allow rich resource abstractions to assemble purpose-fit systems
Programmable infrastructures allow for dynamic optimization to respond to business requirements
[Diagram: Workload Abstraction on top of the SDE Unified Control Plane (Cross-Domain Orchestration); Resource Abstraction with SDC, SDN, and SDS Control and Config in the Control Plane; APIs connecting each to the Data Plane: Heterogeneous Compute Resources & capabilities, Virtualized Network, Heterogeneous Storage Resources & capabilities]
Relation of SDS and SDE to Cloud Layering
Business Process as a Service: Enabling business transformation (Business Process Solutions)
Software as a Service: Marketplace of high value consumable business applications
Platform as a Service: Composable and integrated application development platform
Infrastructure as a Service: Enterprise class, optimized infrastructure, via Software-Defined Environments (SDE): Software-Defined Compute, Software-Defined Storage, Software-Defined Networking
Boxes shown in the layers: Application (x5), External Ecosystem, Industry, Collaboration, Human Resources, Big Data & Analytics, Commerce, Marketing, Development, Big Data & Analytics, Security, Integration, Mobile, Social, Traditional Workloads
Built using open standards
Public. Private. Dynamic hybrid.
Different views of the same coin...
Expectations on a Storage Service:
Consumer | Provider
Self-service | Highly automated storage lifecycle management
Flexible and dynamic, elastic | Virtualized and standards-based, simple capacity planning and forecasting
Cost-efficient, no overprovisioning | Automated and optimized, space-efficient
Charged by capacity and service level used | Capacity reporting and metering, multi-tenancy-enabled
Reliable and always available | Highly available, replicated, self-monitoring and self-healing, secure
No need to have any knowledge of infrastructure details | Automated mapping of consumer requirements to infrastructure capabilities
Key Aspect of IT Service Management in General:
Mapping Business Requirements (Consumer) to Infrastructure Capabilities (Provider)
Separation of concerns
What this means for Storage Service Management
Mapping Business Requirements to Infrastructure Capabilities
Business Requirements: Capacity, Accessibility, Availability, Performance, Security, Retention/Compliance
Infrastructure Capabilities: Media type, Disk technologies, RAID levels, Encryption, Compression, Thin Provisioning, Number of Copies, Access latency, Access protocols, Backup/Replication, etc.
Establishing a service catalog of supported service classes which service consumers can choose from
Different service classes map to different levels of service in some or all of the different service level categories:
• Accessibility
• Availability
• Performance
• Consistency
• Retention / Compliance
• Security
Service Catalog
Service Class “Platinum”: $$$$
Service Class “Gold”: $$$
Service Class “Silver”: $$
Service Class “Bronze”: $
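Choosing from a service catalog can be sketched as a lookup for the lowest-priced class that still meets the consumer's requirements. The attribute names, service levels, and prices below are made-up assumptions for illustration; the slide names only the four classes and their relative cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceClass:
    name: str
    availability_pct: float   # committed availability (assumed value)
    max_latency_ms: float     # committed access latency (assumed value)
    price_per_gb: float       # relative price (assumed value)


# Illustrative catalog; only the class names come from the slide.
CATALOG = [
    ServiceClass("Platinum", 99.999, 1.0, 0.40),
    ServiceClass("Gold", 99.99, 5.0, 0.20),
    ServiceClass("Silver", 99.9, 20.0, 0.10),
    ServiceClass("Bronze", 99.0, 100.0, 0.05),
]


def cheapest_class(min_availability_pct, max_latency_ms):
    """Return the lowest-priced class meeting both requirements, or None."""
    candidates = [
        c for c in CATALOG
        if c.availability_pct >= min_availability_pct
        and c.max_latency_ms <= max_latency_ms
    ]
    return min(candidates, key=lambda c: c.price_per_gb, default=None)


choice = cheapest_class(min_availability_pct=99.9, max_latency_ms=10.0)
print(choice.name)  # Gold
```

Silver would satisfy the availability requirement here but not the latency one, so the consumer lands on Gold without needing to know anything about the infrastructure behind it.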
Defining Requirements for Storage Services:
Service Level Categories – Service Level Objectives (SLOs)
Accessibility: Initial Access Time; Data Sharing; Requires Access Transparency; Max. Out-Of-Space Duration
Availability: Availability Period; Planned Downtime; Max. Unplanned Downtime Aggregate; Max. Unplanned Downtime Per Instance; Recovery Point Objective (RPO); Recovery Time Objective (RTO)
Consistency: Number Of Copies; Number Of Versions; Retain Deleted
Performance: Avg. I/O Rate; Avg. Data Throughput
Retention / Compliance: Immutability; Disposal; Durability; Retention Period
Security: Accountability; Integrity; Authenticity; Confidentiality; Physical Security
Mapping storage resource and management capabilities to SLOs
Accessibility: Tape/Disk/Flash, HSM, NAS exports, vaulting, thin provisioning, ....
Availability, Consistency: Metro Mirror, Global Mirror, Snapshots (app-aware?), Backup/Restore (file/image-level), versioning, ....
Performance: Different disk media (RPMs etc.), tape, flash, RAID levels, Cache, ...
Retention / Compliance: WORM storage, automated deletion, data shredding, media lifetime, ...
Security: Encryption, key management, access controls, lockable cabinets, etc.
Provider's Goal: Maximize storage capacity, minimize down-time:
Store data with as little cost as possible while maintaining committed SLAs (Service Level Agreements) – How?
• Thin provisioning: Allocate only as much storage as is used, expand allocation as needed
• Compression: Reduce actual capacity used
• Data deduplication: Store only one copy of files/blocks containing the same data
• Tiering: Always place data on the lowest-cost storage tier that still fulfills customer requirements, optimize continuously
• Monitoring: Threshold-based alerting to detect impending performance bottlenecks early, balance volumes to address them
• Virtualization: Employ virtualization to have the freedom of moving data to lower-cost storage without any downtime
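Two of these techniques, tiering and thin provisioning, are simple enough to sketch. The tier media, costs, latencies, growth step, and headroom factor below are all hypothetical values chosen for illustration; they are not taken from the slide.

```python
# Hypothetical tiers: (name, cost per GB, latency in ms). Values are made up.
TIERS = [
    ("Tier 0 (flash)", 1.00, 0.5),
    ("Tier 1 (fast disk)", 0.40, 5.0),
    ("Tier 2 (capacity disk)", 0.15, 15.0),
    ("Tier 3 (tape/archive)", 0.03, 60000.0),
]


def place(max_latency_ms):
    """Tiering: pick the lowest-cost tier whose latency meets the requirement."""
    eligible = [t for t in TIERS if t[2] <= max_latency_ms]
    if not eligible:
        raise ValueError("no tier satisfies the requirement")
    return min(eligible, key=lambda t: t[1])[0]


def thin_allocate(allocated_gb, used_gb, step_gb=10, headroom=0.8):
    """Thin provisioning: grow the allocation only when usage nears it."""
    while used_gb > allocated_gb * headroom:
        allocated_gb += step_gb
    return allocated_gb


print(place(max_latency_ms=20.0))                   # Tier 2 (capacity disk)
print(thin_allocate(allocated_gb=10, used_gb=27))   # 40
```

Running place() again whenever requirements or tier costs change is one way to read "optimize continuously".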
[Chart: Optimal Storage Tier Distribution across Tier 0, Tier 1, Tier 2, and Tier 3; segment shares shown: 1-5%, 15-20%, 20-25%, 50-60%]
Flexibility through Storage Virtualization
Traditional Storage
• Capacity is isolated in SAN islands
• Multiple management points
• Potentially poor capacity utilization
• Capacity is purchased for and owned by individual applications
With Storage Virtualization
• Combines storage capacity into 1 large storage pool
• Single management point
• Uses storage assets more efficiently
• Capacity purchases can be deferred
• Plus: Non-disruptive data migration between storage resources
[Diagram: traditional SAN with HDS, IBM, EMC, and HP storage at 95%, 20%, and 50% capacity; SAN with a Storage Hypervisor pooling HDS, IBM, EMC, and HP storage at 55% capacity]
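The 95%, 20%, 50%, and 55% figures in the diagram are consistent with simple pooling arithmetic, if they are read as utilization levels and the arrays are assumed to be equal in size (the slide does not state either). A short check:

```python
# Utilization of three isolated arrays, as labeled in the diagram.
isolated = [0.95, 0.20, 0.50]
capacity_tb = [100, 100, 100]   # assumption: equal-sized arrays

# Pooling adds up used and total capacity across all arrays.
used_tb = sum(u * c for u, c in zip(isolated, capacity_tb))
pooled = used_tb / sum(capacity_tb)
print(f"{pooled:.0%}")  # 55%
```

The array that was nearly full at 95% no longer forces a purchase, because the pool as a whole still has free capacity.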
Storage Management Interface Abstraction via SMI-S
• SMI-S (Storage Management Initiative – Specification) started in 2002, with the goal of standardizing the management interfaces of storage devices
• SMI-S is currently supported by 21 different vendors
• SMI-S is developed by a workgroup of the SNIA (Storage Networking Industry Association); it has since been accepted as both an ANSI and an ISO standard
(http://www.snia.org/ctp/conformingproviders/index.html)
SMI-S builds on CIM (Common Information Model), defined by the DMTF, uses XML for formatting the payload, and HTTP as the transport mechanism
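To make "XML payload over HTTP" concrete, the sketch below builds (but does not send) a CIM-XML EnumerateInstances request for storage volumes. The element and header names follow my reading of the DMTF CIM-XML conventions and should be checked against the specification; the namespace "root/example" is a hypothetical placeholder, since real providers publish their own.

```python
import xml.etree.ElementTree as ET


def build_enumerate_instances(class_name, namespace, message_id="1001"):
    """Build a CIM-XML EnumerateInstances request body and its HTTP headers."""
    cim = ET.Element("CIM", CIMVERSION="2.0", DTDVERSION="2.0")
    message = ET.SubElement(cim, "MESSAGE", ID=message_id, PROTOCOLVERSION="1.0")
    request = ET.SubElement(message, "SIMPLEREQ")
    call = ET.SubElement(request, "IMETHODCALL", NAME="EnumerateInstances")

    # The namespace path is sent as one NAMESPACE element per component.
    path = ET.SubElement(call, "LOCALNAMESPACEPATH")
    for part in namespace.split("/"):
        ET.SubElement(path, "NAMESPACE", NAME=part)

    param = ET.SubElement(call, "IPARAMVALUE", NAME="ClassName")
    ET.SubElement(param, "CLASSNAME", NAME=class_name)

    headers = {
        "Content-Type": 'application/xml; charset="utf-8"',
        "CIMOperation": "MethodCall",
        "CIMMethod": "EnumerateInstances",
        "CIMObject": namespace,
    }
    return ET.tostring(cim, encoding="unicode"), headers


# "root/example" is a hypothetical namespace used only for illustration.
body, headers = build_enumerate_instances("CIM_StorageVolume", "root/example")
print(body)
```

In practice a management tool would POST this body to the provider's CIM object manager and parse the XML response; a CIM client library normally hides these details.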
OSLC: Built on Linked Data
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards
4. Include links to other URIs, so that they can discover more things
Linked Data describes a method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried [1]
[1] Bizer, Heath, Berners-Lee (2009). "Linked Data - The Story So Far"
Open Services for Lifecycle Collaboration: http://open-services.net/
OSLC describes a method for integrating disparate tools, across domains, by providing a set of integration services through which other tools can be discovered and more information about resources retrieved. This is enabled by Linked Data.
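The four Linked Data rules amount to "look up a URI, read the data, follow its links". The sketch below imitates that with an in-memory dictionary instead of real HTTP requests; all URIs, property names, and resources are hypothetical and are not part of OSLC.

```python
# A tiny in-memory stand-in for the web: URI -> resource description.
# All URIs and property names are hypothetical.
WEB = {
    "http://example.org/volumes/42": {
        "title": "Volume 42",
        "links": {"hostedOn": "http://example.org/arrays/7"},
    },
    "http://example.org/arrays/7": {
        "title": "Array 7",
        "links": {"locatedIn": "http://example.org/sites/a"},
    },
    "http://example.org/sites/a": {"title": "Site A", "links": {}},
}


def lookup(uri):
    """Stand-in for an HTTP GET that returns structured data for a URI."""
    return WEB[uri]


def discover(start_uri):
    """Follow links from one resource to everything reachable from it."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        queue.extend(lookup(uri)["links"].values())
    return [lookup(u)["title"] for u in seen]


titles = discover("http://example.org/volumes/42")
print(titles)  # ['Volume 42', 'Array 7', 'Site A']
```

A tool that knows only the volume's URI can reach the array and the site without prior knowledge of either, which is the kind of discovery across tools that the OSLC description refers to.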
OpenStack provides an open mechanism for provisioning/managing storage to workloads and is driving a rapidly developing ecosystem
OpenStack storage includes
• Cinder: Provision and manage block storage for compute
• Swift: object storage
• Manila (future): file storage
IBM provides support for OpenStack, and provides extensions through standard mechanisms
OpenStack provides a common, open interface for ISVs, applications and cloud admins to provision and manage storage resources
Integrated with compute and networking management
[Diagram labels: Capabilities; Commands; Drivers; OpenStack Cinder; OpenStack Swift; Object Middleware; IBM sol'n for Internal storage; Smarter Mgmt on any storage; Smarter Data Protection; IBM Enterprise Object Storage Solution Platform; IBM storage; TPC; Community/Competitor storage support; Community; Integration with SDN and SDC; Enable & Extend; Differentiate]
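As an illustration of the common, open interface, a block volume is requested from Cinder with a small JSON document sent over HTTP. The sketch below only builds such a request and does not contact a cloud. The body shape follows the OpenStack Block Storage API as I understand it; the "/v3/" path prefix is that of later API versions than this deck, and the project ID, volume name, and "gold" volume type are hypothetical.

```python
import json


def cinder_create_volume_request(project_id, size_gb, name, volume_type=None):
    """Return (method, path, JSON body) for a Cinder create-volume call."""
    volume = {"size": size_gb, "name": name}
    if volume_type is not None:
        # A volume type is one way a provider could expose a service class.
        volume["volume_type"] = volume_type
    return "POST", f"/v3/{project_id}/volumes", json.dumps({"volume": volume})


# Hypothetical project ID, volume name, and volume type.
method, path, body = cinder_create_volume_request(
    "demo-project", size_gb=10, name="billing-data", volume_type="gold"
)
print(method, path)
print(body)
```

The same request works regardless of which vendor's driver sits behind Cinder, which is where the interface abstraction comes from.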
In Summary...
• The exponential and ongoing growth in data storage requirements calls for new, more flexible storage management methods
• Software-Defined Storage promises to provide the flexibility, service orientation, and cost-efficiency required to address today's requirements
• By abstracting storage resource capabilities through service classes and APIs, SDS can „snap in“ to an SDE
• The 2 primary views on SDS are that of a service consumer and a service provider, each having overlapping goals and expectations
• Main challenges for the provider are to map consumer-specified business requirements to storage infrastructure capabilities, and to maintain committed SLAs
• Various technologies and methods exist that enable the provider to offer an attractively priced, yet sustainable storage service
What's the problem with storage these days?
Data growth is exponential…
The world's total data per person: 0.8 GB (2003), 24 GB (2006), 128 GB (2010)
Digital Information Created, Captured, Replicated WW
2006: 180 exabytes
2007: 280 exabytes
...
2011: 1800 exabytes
(1800 billion gigabytes)
Expected compound annual
growth rate is almost 60%
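The "almost 60%" figure can be checked from the 2006 and 2011 values, assuming it is meant as the compound annual growth rate over that five-year span:

```python
start_eb, end_eb = 180, 1800      # exabytes in 2006 and 2011 (from the slide)
years = 2011 - 2006

# Compound annual growth rate: the constant yearly factor that turns
# the start value into the end value over the given number of years.
cagr = (end_eb / start_eb) ** (1 / years) - 1
print(f"{cagr:.1%}")  # 58.5%
```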
Drivers:
● Velocity of Change
● Acquisitions
● Mergers
● Consolidations
● ILM, Data Retention initiatives
● “Born on the Web” type applications
● Legal Requirements
● Regulations demanding data to be retained for many years
● Ever increasing variety of data stored digitally…
Sources:
IDC, Worldwide Disk Storage Systems 2007-2011 Forecast Update, Doc #209490
IDC Whitepaper: The Diverse and Exploding Digital Universe, March 2008
Globally, storage requirement is 80% file-based unstructured data,
and growing
The explosion of data, transactions, and digitally-aware devices strains IT infrastructure and operations. Storage capacity is doubling every 18 months.
The majority of this data is unstructured and file-based, such as user files, medical images, web and rich media content, growing at 63%
Block storage, while still well suited for existing OLTP/database workloads, is not where the majority of strategic analytics-based applications and strategic storage initiatives are being deployed
[Chart: Worldwide Storage Capacity Shipped by Segment, 2008–2013]
Source: IDC, State of File-Based Storage Use in Organizations: Results from IDC's 2009 Trends in File-Based Storage Survey: Dec 2009: Doc # 221138
Customer Storage Needs - General
From: Research Report: 2012 Storage Market Survey. Source: Enterprise Strategy Group Created for Connie Bright, IBM. © 2012 Enterprise Strategy Group, Inc. All Rights Reserved
• Across a range of customer types, rapid growth of unstructured data, the complexity of data protection, and hardware costs are the biggest challenges.
• There is a list of other issues, made worse by the size and growth rates of data
  • Space constraints, poor utilization
  • Management tasks
  • Long implementation times
  • Lack of skills
  • Staff costs
• Several System trends show through to Storage
  • Support for virtual server environments
  • Support for VDI
Platinum
Gold
Silver
Bronze
Authentication/Auditing
Encryption
Mirroring/DR
High Availability
Striping
Clustering
Compression
Tiering/ILM
Backup & Recovery
Deduplication
Security and Availability
Performance and Opt.
Workload Abstraction, Resource Abstraction, Continuous Optimization, Mapping to Resource
Storage Services Layer
RESILIENCY CAPABILITY
OPTIMIZATION
FABRIC
MANAGEMENT
SOFTWARE DEFINED STORAGE
• Storage Abstraction • Storage Provisioning • Storage Monitoring • SAN/GPFS/NAS/DAS
• FC/FCoE/iSCSI/InfiniBand • Zone management
• Storage replication • Disaster recovery • Consistency groups • Backup
HETEROGENEITY
• Storage tiers • Performance-aware placement • Continuous optimizations • Migration
SOFTWARE
DEFINED
COMPUTE
SOFTWARE
DEFINED
NETWORK
What is needed for Software Defined Storage?
Control Plane (incl. resource abstraction) - Management
• Storage Resource Management
• Business Continuity Management
• Data Protection Management
• Storage Service Management
Data Plane - I/O
Services
• Thin Provisioning
• De-Duplication
• Data Replication
• Encryption
• Compression
• ...
Devices
• Block Storage Systems / Storage Arrays
• File Storage Systems / NAS Filers
• Object Storage Systems
• Tape Systems / Archive Systems
• Storage Virtualizers
• Storage Networks
Example of a Software Defined Storage Platform
Key attributes (IDC):
• It is software
• It offers a full suite of storage services
• It federates physical storage capacity
Storage Infrastructure
Policy-based Management and Automation
Snapshot and Backup Management
Management Software Platform
Security and Availability
Authentication/Auditing
Encryption
Mirroring/DR
High Availability
Striping
Clustering
Compression
Tiering/ILM
Backup & Recovery
Deduplication
Performance and Opt.
Cluster File System
Block Virtualization
Object Storage
Storage Software Platform
Feature Options
Control Plane Layer
Data Plane Layer
IBM Storwize Storage Software Platform
Tivoli Storage Productivity Center / FlashCopy Manager
IBM SmartCloud Virtual Storage Center