It's the End of Data Storage As We Know It (And I Feel Fine)
It's the End of Data Storage As We Know It (And I Feel Fine!)
Stephen Foskett, Community Organizer, Tech Field Day
Outline
Technological change is finally coming to storage, and it will wipe away the architecture we've come to know over the last few decades. Say goodbye to the "do it all" Fibre Channel SAN storage array and get ready for converged infrastructure, distributed storage, alternative attachments like PCIe, and top-of-rack flash! In this session, Stephen Foskett will explain why this change is inevitable and how it will shake out. You won't recognize what's coming, but it will be faster, cheaper, and more integrated than ever!
Hello! My Name Is Stephen!
I’m a storage guy…
…but I love virtualization too!
You may know me as “the Tech Field Day guy”
Or perhaps for some other crazy techie nonsense
Chapter 1: Why Is Storage Like It Is?
Storage Is…
Data storage – the act of saving information for later use
Storage is not (necessarily):
– Disks
– SCSI
– Fibre Channel
– RAID
– Arrays
Prehistoric Evolution of Storage
Tape
• “We can store stuff!”
• Good sequential throughput; non-existent random access
Disk
• “Spinning rust”
• Reasonable compromise between sequential and random
Disk Array
• “A bunch of disks pretending to be one”
• Faster and redundant
Three Things Storage Arrays Do Well…
Acceleration
• Aggregation (wide-striping)
• Caching (predictive write-back cache)
• Tiering (automated SSD tiers)
Motion
• Local copies (snapshots, mirrors, and data movement)
• Remote copies (data replication)
Sharing
• Multi-client (SAN, NAS)
• Multi-protocol (iSCSI/FC/FCoE, NFS/SMB)
Ye Olde I/O Path
Server = HBA = LUN
Arrays can…
– Accelerate I/O by predicting and pre-filling the cache
– Move and copy data logically as a whole LUN/server
– Share data while knowing “who” is accessing it
[Diagram labels: Block-O-Matic, three ProServer hosts]
Today’s Storage Market
SAN and NAS try to strike a balance between capacity and performance optimization
– The storage network slows performance but allows sharing
– Because they are shared, arrays must offer lots of capacity
[Diagram labels: Capacity, Performance, Networked Storage Arrays]
Chapter 2: How Is Storage Changing?
RAID Can’t Keep Up
RAID is inflexible
RAID is bad at math
RAID has no (data) integrity
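To make the integrity point concrete, here is a minimal Python sketch of single-parity (RAID 5-style) XOR math. It is an illustration under stated assumptions, not any vendor's implementation: the function names, the three-block stripe, and the use of CRC32 as the checksum are all made up for this example. Parity can rebuild a block that is known to be missing, but when a disk silently returns bad data, parity alone only says "something is wrong," not which block is wrong. A separate per-block checksum can.

```python
import zlib
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (single-parity math)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(surviving_blocks, parity):
    """Rebuild one known-missing block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])

def parity_consistent(blocks, parity):
    """True if the stripe still matches its stored parity."""
    return xor_blocks(blocks) == parity

def find_corrupt(blocks, checksums):
    """Indexes of blocks whose CRC32 no longer matches (hypothetical helper)."""
    return [i for i, (block, crc) in enumerate(zip(blocks, checksums))
            if zlib.crc32(block) != crc]

data = [b"AAAA", b"BBBB", b"CCCC"]          # one stripe across three data disks
parity = xor_blocks(data)                   # stored on the parity disk
checksums = [zlib.crc32(block) for block in data]

# Case 1: a disk fails outright. We know which block is gone, so parity works.
rebuilt = rebuild([data[0], data[2]], parity)

# Case 2: a disk silently returns bad data. Parity detects a mismatch...
damaged = [data[0], b"BxBB", data[2]]
mismatch = not parity_consistent(damaged, parity)

# ...but only the per-block checksums identify the culprit.
culprits = find_corrupt(damaged, checksums)

print(rebuilt, mismatch, culprits)
```

In this sketch the checksums live outside the RAID layer, which is the design choice checksumming filesystems and object stores make.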
Block Storage Is Stupid*
Object = Data in Databases
• Simple high-level protocols for Create, Read, Update, Delete
• Independent of data location, protection, hardware
File = Remote Directories
• NAS and file servers handle file translation and organization
• Data access uses directory location, filename, offset
Block = Fake Disks
• Filesystem (driver in the computer) locates files
• Protocols: SATA, SCSI, Fibre Channel, iSCSI, FCoE, USB, FireWire, thumb drive, etc.
*I’m being completely serious
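The difference between the three models is easiest to see in what the client has to know. The toy Python classes below are a sketch, not real protocol code: the class names, method names, and the 8-byte block size are assumptions chosen for illustration. With block storage the client keeps its own map of which blocks hold a file; with file storage it names a path and an offset; with object storage it only names a key.

```python
class BlockDevice:
    """A "fake disk": numbered fixed-size blocks and nothing else."""
    def __init__(self, block_count, block_size=8):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * block_count

    def write_block(self, lba, data):
        # Pad or trim to exactly one block, as a disk would store it.
        self.blocks[lba] = data.ljust(self.block_size, b"\0")[:self.block_size]

    def read_block(self, lba):
        return self.blocks[lba]


class FileStore:
    """A "remote directory": the server organizes files; clients give path + offset."""
    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path, offset, length):
        return self.files[path][offset:offset + length]


class ObjectStore:
    """Create/Read/Update/Delete by key; location and protection stay hidden."""
    def __init__(self):
        self.objects = {}

    def create(self, key, data):
        self.objects[key] = data

    def read(self, key):
        return self.objects[key]

    def update(self, key, data):
        self.objects[key] = data

    def delete(self, key):
        del self.objects[key]


payload = b"hello, storage"

# Block: the client-side filesystem must remember which blocks hold the file.
disk = BlockDevice(block_count=4)
file_map = {"greeting.txt": [0, 1]}          # lives in the client, not the array
for lba, start in zip(file_map["greeting.txt"], range(0, len(payload), 8)):
    disk.write_block(lba, payload[start:start + 8])
from_block = b"".join(disk.read_block(lba) for lba in file_map["greeting.txt"])[:len(payload)]

# File: the server knows about files; the client asks by path and offset.
share = FileStore()
share.write("/share/greeting.txt", payload)
from_file = share.read("/share/greeting.txt", offset=7, length=7)

# Object: the client only knows a key.
bucket = ObjectStore()
bucket.create("greeting", payload)
from_object = bucket.read("greeting")

print(from_block, from_file, from_object)
```

Note that in the block case the array never learns that blocks 0 and 1 belong together, which is the information gap the next slides build on.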
Moving Beyond Blocks
Operating systems already speak “file”
– LAN Manager, SMB/CIFS, NFS
Hypervisors now speak “file”
– VMware = NFSv3
– Hyper-V = SMB3
Applications speak “file” or “object”
– File = POSIX, Windows APIs, etc.
– Object = Amazon S3, etc.
What Does Virtualization Do?
Server ≠ HBA ≠ LUN
Arrays see a random stream of data
– Acceleration is limited to write-back and “most-recently used” caching
– Moves and copies of whole LUNs are less useful
– Shared access leads to locking conflicts
[Diagram labels: Block-O-Matic, three VM Guests, Hypervisor]
“The I/O Blender” Demands New Architectures
Server virtualization throws block I/O into a blender: All I/O is now random!
– Caching pre-fetch is confounded
– Granular movement is impossible
– Shared storage is stymied
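The blender effect can be sketched in a few lines of Python. This is a toy model under loud assumptions: three guests each read their own region of one shared LUN strictly in order, and the hypervisor is modeled as a simple round-robin interleave (real hypervisors schedule I/O in more complicated ways). The helper names are hypothetical. The metric is the share of requests that directly follow the previous block address, which is roughly what a sequential pre-fetcher needs to see.

```python
from itertools import chain, zip_longest

def sequential_stream(start_lba, length):
    """One guest reading `length` consecutive blocks starting at `start_lba`."""
    return list(range(start_lba, start_lba + length))

def blend(streams):
    """Interleave per-guest streams round-robin (stand-in for the hypervisor)."""
    mixed = chain.from_iterable(zip_longest(*streams))
    return [lba for lba in mixed if lba is not None]

def sequential_fraction(lbas):
    """Share of requests whose address is exactly the previous address + 1."""
    if len(lbas) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + 1)
    return hits / (len(lbas) - 1)

# Three guests, each perfectly sequential inside its own slice of the LUN.
guests = [sequential_stream(start, 100) for start in (0, 10_000, 20_000)]

alone = sequential_fraction(guests[0])        # what a dedicated LUN would show
blended = sequential_fraction(blend(guests))  # what the shared array sees

print(f"one guest alone: {alone:.0%} sequential")
print(f"after blending:  {blended:.0%} sequential")
```

Each guest is fully sequential by itself, yet under this model the array never sees two adjacent addresses in a row, so a pre-fetcher keyed on address order has nothing to work with.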
What About NFS and SMB?
File I/O to the Array
Arrays get better information but must be specially designed to act on it
– Thin provisioning and acceleration can work better
– No locking issues
– VAAI and VSS may allow per-file movement
[Diagram labels: File-O-Matic, three VM Guests, Hypervisor]
Array Integration APIs
Hypervisor <-> Array
APIs are a partial solution
– VMware VAAI = vSphere offloaded copy, snapshots, thin provisioning
– Microsoft ODX = Windows Server 2012 and Hyper-V offloaded copy & thin provisioning
– VMware VASA enhances vSphere automation
[Diagram labels: Block-O-Matic, three VM Guests, Hypervisor, VAAI, ODX, VASA]
The Solid-State Storage Fairy
Solid State Storage is appearing everywhere!
– SSDs in servers and arrays
– PCIe cards in servers and arrays
– Dedicated appliances and arrays
Solid state storage can be used in many different ways
– Read-only cache
– Read/write cache
– Tiered storage arrays
– All-solid state arrays
Some use NAND flash, others use DRAM
[Diagram labels: Stor-O-Matic, ProServer, SAN/LAN, SSD (repeated), Super-SSD]
How Fast Is It?
[Chart labels: USB2 Drive, SATA HDD, SATA SSD, PCIe Drive, Memory; File Copy, Windows, Server, Rack, Datacenter]
Chapter 3: What Will Storage Look Like?
“Software-Defined”?
“Computer”
• People use computers
• Hardware-oriented, “hold it in your hands”
• Complex, user-friendly interfaces
“Server”
• Computers use servers
• Operating system and hypervisor-focused
• Standards-based protocol interfaces
“Platform”
• Applications use platforms
• Software-only, “bits and bytes”
• Application programming interfaces (APIs)
Virtualizing the Controller
The whole storage array can be a virtual machine
Storage arrays can even run virtual machines
[Diagram labels: Block-O-Matic, Front-End I/O, ProServer, Back-End I/O, Hypervisor]
Distributed Storage
Use storage virtualization software to combine local storage resources
Distribute data intelligently
– Across devices for reliability
– Tiered flash + disk
Scale with clients
No expensive SAN or storage network needed!
[Diagram labels: three ProServer hosts; Shared Storage: Distribution and Protection]
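One way to picture “distribute data across devices for reliability” is a placement function that puts every chunk on more than one server. The deck does not name a placement algorithm; the Python sketch below uses rendezvous (highest-random-weight) hashing purely as an assumed example, and the node names, chunk IDs, and two-copy policy are made up for illustration.

```python
import hashlib

def score(node, chunk_id):
    """Deterministic pseudo-random weight for a (node, chunk) pair."""
    digest = hashlib.sha256(f"{node}:{chunk_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def place(chunk_id, nodes, copies=2):
    """Pick `copies` distinct nodes with the highest scores (rendezvous hashing)."""
    ranked = sorted(nodes, key=lambda node: score(node, chunk_id), reverse=True)
    return ranked[:copies]

# Hypothetical cluster: local disks in three servers pooled by software.
nodes = ["proserver-1", "proserver-2", "proserver-3"]
layout = {chunk: place(chunk, nodes) for chunk in range(12)}

# Lose one server and check that every chunk still has a live copy.
failed = "proserver-2"
survivors = {chunk: [node for node in owners if node != failed]
             for chunk, owners in layout.items()}

for chunk, owners in layout.items():
    print(chunk, owners, "->", survivors[chunk])
```

Because every client can compute `place` on its own, no central array has to sit in the data path, which fits the “scale with clients” point. A real system would also re-replicate the chunks that dropped to one copy.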
The Return of Local Storage
[Diagram labels: Block-O-Matic (Front-End I/O, ProServer, Back-End I/O, Windows or Hypervisor); three ProServer hosts; Shared Storage: Distribution and Protection]
Where Should Features Live?
Software
Hardware
We need reliability in software or hardware, but not necessarily both
– If operating systems don’t provide reliability, scalability, and manageability, then server hardware must
– Server virtualization can add these features, so you don’t necessarily need them in hardware!
“The marginal cost of reliable hardware is linear while the marginal cost of reliable software is zero.” – Sam Johnston
The Distributed Storage Future
Storage is moving out of the network and closer to the servers
– Software manages data sharing
– Low-latency connections allow much greater performance
– Low-cost JBOD is for bulk storage
[Diagram labels: Capacity, Performance, Distributed Flash, Distributed JBOD]
Disaggregated Storage
Top-of-rack performance
Speedy high-performance storage:
• InfiniBand
• PCIe flash
high cost, high performance
Bottom-of-rack capacity
Scaly capacity-oriented storage:
• SAS JBOD
• Object store
• Cloud gateway
low cost, low performance
Tied together with software!
[Diagram labels: seven ProServer hosts, “Flashy!”, “Scaly!”]