2012 Storage Developer Conference. © Calsoft Inc. All Rights Reserved.
How can Hypervisors leverage
Advanced Storage features?
Author: Shriram Pore, Solutions Architect, Calsoft Inc. Presenter: Dr. Anupam Bhide, CEO, Founder Calsoft Inc.
Introduction
• Useful for storage vendors who are considering implementing hypervisor storage APIs like VAAI
• Understand how hypervisors interact with storage today, and the limitations in that interaction
• The need for a standard that is both hypervisor and storage agnostic
• Some areas that today's hypervisor-specific standards do not cover
Virtual Environment Challenges
Challenges
• Virtual environments (hypervisors) with NAS/SAN arrays have high storage bandwidth usage (IP/FC/etc.) compared to DAS.
• SAN/NAS arrays have mature technologies such as snapshot, clone, server copy, range locking, etc.
• Hypervisors can leverage these mature technologies to improve storage and network utilization.
• However, hypervisors themselves have developed sophisticated storage virtualization layers.
Solution
• The goal is to offload file/bulk-block operations to NAS/SAN arrays/servers to reduce storage bandwidth and increase I/O performance, storage utilization, etc.
Hypervisors' use of storage
Classification of how hypervisors use storage:
• Hypervisor using local disks (e.g. VMware VMFS on a local disk)
• Hypervisor creating a proprietary file system on SAN (e.g. VMware VMFS on a SAN array)
• Hypervisor using LUNs of a SAN array in RDM mode
• Hypervisor using a NAS storage box over NFS/CIFS
Better integration can have big benefits in many cases; e.g., vendors have reported 99% savings in network bandwidth for cloning operations and 10x to 20x gains in efficiency.
Need for a standard to optimize hypervisor interaction with storage
(Diagram: ESX(i), Hyper-V, and Xen hypervisors call through a common interface layer into a SNIA interface for SAN and a SNIA interface for NAS storage.)
Identify
• The standard needs to be both hypervisor independent and storage independent
• Identify inefficient file/block operations on VMs in virtual environments
Define
• Define a set of SAN/NAS primitives (standard APIs) based on the above identification, including capability exchange.
Compliance
• Levels of compliance depend on the interface implementations for storage and hypervisor
Why the need?
• Storage is the biggest bottleneck for hypervisor performance
• Many common hypervisor operations can be optimized by delegating operations to storage boxes
• Storage vendors cannot afford to conform to multiple hypervisor standards for this delegation
SAN based Architecture
(Stack diagram: applications (VMs, web services) sit on the virtual disk library, which manages a set of files (disk metadata) on a hypervisor-proprietary file system such as VMFS; files and file segments map to logical blocks in the block layer (block devices, logical volumes), and logical blocks map to device blocks; below the user/kernel boundary, physical SCSI device access reaches vendor-specific SCSI extensions on the SAN array.)
SAN Primitives
Offloaded data transfer to the arrays.

Block Zeroing
• Goal: avoid multiple WRITEs; use SCSI WRITE SAME
• Use case: the Block Zero feature speeds up deployment of thick-provisioned, eager-zeroed virtual disks

Full Copy
• Goal: avoid the multiple READs and WRITEs, and the network bandwidth, needed to copy a complete VMDK; use SCSI EXTENDED COPY
• Use case: the Full Copy feature speeds up Storage vMotion and cloning of VMs

Hardware Assisted Locking
• Goal: block/extent-level locking; use SCSI ATOMIC TEST & SET
• Use cases: operations that require VM locking (power on/off, etc.) and cluster-wide operations like vMotion and Storage vMotion; a power-on storm (100s of VMs powering up on the same LUN) has huge latency, which is resolved by hardware-assisted locking

Thin Provisioning
• Goal: errors to indicate soft and hard out-of-space conditions; pause the VM on hard errors; use UNMAP for space reclamation
• Use case: Dead Space Reclamation enables reclamation of blocks from a thin-provisioned LUN on SAN-based arrays

Representative SAN Primitives and Use Cases (illustrations from VMware SAN VAAI and Hyper-V ODX)
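As a concrete illustration of the Block Zeroing primitive, here is a minimal sketch of how a host-side driver might build a WRITE SAME(16) CDB. The field layout follows the SCSI block commands specification, but `write_same16_cdb` is a hypothetical helper, not part of any hypervisor API, and actually submitting the CDB to a device (e.g. via SG_IO on Linux) is omitted.

```python
import struct

def write_same16_cdb(lba: int, num_blocks: int, unmap: bool = False) -> bytes:
    """Build a SCSI WRITE SAME(16) CDB (opcode 0x93).

    For block zeroing, the accompanying data-out buffer carries a single
    all-zero logical block; the array replicates it across the extent,
    so the host never sends every block over the wire.
    """
    cdb = bytearray(16)
    cdb[0] = 0x93                               # WRITE SAME(16) opcode
    if unmap:
        cdb[1] |= 0x08                          # UNMAP bit: deallocate instead
    cdb[2:10] = struct.pack(">Q", lba)          # 64-bit starting LBA
    cdb[10:14] = struct.pack(">I", num_blocks)  # 32-bit number of blocks
    return bytes(cdb)

# Zero a 1 GiB extent (2,097,152 512-byte blocks) starting at LBA 0
cdb = write_same16_cdb(lba=0, num_blocks=2 * 1024 * 1024)
```

A single 16-byte command thus replaces gigabytes of zero-fill WRITE traffic, which is the whole point of the primitive.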
Types of VMware virtual disks
• Thick provisioned: all backing storage immediately allocated, but not zeroed immediately
• Thin provisioned: backing storage not fully allocated
• Thick provisioned with eager zeroing: like thick provisioned, but also zeroed immediately
Efficiencies gained
Block zeroing
• Storage vendor can lazily zero the extent in the background
• Can implement proprietary mechanisms to mark the extent as zeroed out
• In thin-provisioned LUNs, the storage vendor can unmap the extent
Full copy
• Doing the copy within the storage box is more efficient
• Furthermore, writeable snapshot (i.e. clone) technology can be used to do the full copy without actually copying blocks
• Dedupe performance can be improved by recognizing that two extents are just copies of each other
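The claim that writeable-snapshot technology can serve a full copy "without actually copying blocks" can be made concrete with a toy copy-on-write model. All names here (`ExtentStore`, `clone`, etc.) are illustrative inventions, not any vendor's implementation:

```python
class ExtentStore:
    """Toy copy-on-write extent store: a 'full copy' just bumps refcounts."""

    def __init__(self):
        self.extents = {}    # extent id -> block data
        self.refcount = {}   # extent id -> number of volumes sharing it
        self.volumes = {}    # volume name -> list of extent ids

    def create(self, name, data_blocks):
        ids = []
        for block in data_blocks:
            eid = len(self.extents)
            self.extents[eid] = block
            self.refcount[eid] = 1
            ids.append(eid)
        self.volumes[name] = ids

    def clone(self, src, dst):
        # Offloaded "full copy": share extents instead of moving any data
        ids = list(self.volumes[src])
        for eid in ids:
            self.refcount[eid] += 1
        self.volumes[dst] = ids

    def write(self, name, index, block):
        # Copy-on-write: a private extent is allocated only on first write
        eid = self.volumes[name][index]
        if self.refcount[eid] > 1:
            self.refcount[eid] -= 1
            new_eid = len(self.extents)
            self.extents[new_eid] = block
            self.refcount[new_eid] = 1
            self.volumes[name][index] = new_eid
        else:
            self.extents[eid] = block
```

The clone completes in time proportional to the extent count, not the data size, and shared extents are exactly what a dedupe engine can recognize as copies of each other.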
Efficiencies gained
Atomic Test and Set
• Many VM operations, such as powering VMs on or off, require acquiring or releasing a VM-specific lock. This requires either SCSI-2 reservations or SCSI-3 persistent group reservations, both of which are inefficient.
• Storage vendors can optimize by implementing atomic test and set.
UNMAP
• Thin-provisioned LUNs can re-use deleted space
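Why atomic test and set replaces reservations can be pictured with a toy model of compare-and-write semantics. This is an illustration of the locking behavior only, not the SCSI wire format; `threading.Lock` stands in for the atomicity the array guarantees internally.

```python
import threading

class Lun:
    """Toy LUN with a COMPARE AND WRITE style atomic test-and-set.

    A real array executes the compare and the write as one atomic step,
    so hosts can take an on-disk lock on a single block instead of
    reserving the whole LUN with SCSI-2/SCSI-3 reservations.
    """

    def __init__(self, blocks):
        self.blocks = blocks
        self._atomic = threading.Lock()  # stands in for the array's atomicity

    def compare_and_write(self, lba, expected, new):
        with self._atomic:
            if self.blocks[lba] == expected:
                self.blocks[lba] = new
                return True   # GOOD status: lock acquired
            return False      # MISCOMPARE: another host holds the lock

FREE, HELD = b"\x00", b"\x01"
lun = Lun({7: FREE})                              # block 7 is the lock record
assert lun.compare_and_write(7, FREE, HELD)       # first host wins the lock
assert not lun.compare_and_write(7, FREE, HELD)   # second host sees MISCOMPARE
```

Because only the contested block is locked, hundreds of hosts can take independent per-VM locks on the same LUN concurrently, which is what defuses the power-on storm.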
Integration of Hypervisors and SAN capabilities
Steps Involved
Identify
• Identify inefficient bulk-block operations on VMs in virtual environments/hypervisors
Define
• Define a set of SAN primitives (unused existing SCSI commands to be overloaded, or new SCSI commands to be adopted) based on the above identification, including capability exchange (supported primitives); these can be added to SNIA specifications for standardization
Implement
• SAN vendors implement the SCSI commands to deliver the desired functionality, using their own technologies to achieve maximum performance
Call
• Hypervisor invokes the SCSI commands to leverage SAN capabilities
Adopt
• SAN arrays/servers adopt proven, efficient technologies for increased performance and reduced network bandwidth usage (features like dynamic LUN provisioning, thin provisioning, concurrent provisioning, space reclamation, dynamic snapshots, LUN migration, etc.)
NAS based Architecture - Plugin Approach
(Stack diagram: applications (VMs, web services) sit on the virtual disk library; through a virtual disk library plugin API, a set of files (disk metadata) lives on a NAS-proprietary file system accessed over NFS/CIFS; files and file segments map to logical blocks in the block layer (block devices, logical volumes) and then to device blocks; below the user/kernel boundary, physical SCSI device access and a vendor-specific NFS or custom RPC plugin reach the NAS server.)
Plugin Approach
• The plugin approach allows vendors to use their own communication mechanism
• Many vendors use unused NFS commands
NAS Primitives

File Space Reservation
• Goal: monitor space utilization in a sparse file and guarantee adequate space for VMs; allows future VM I/O operations NOT to fail due to space unavailability
• Use cases: thick provisioning is the normal standard in enterprise deployments of hypervisors; with NAS file systems, POSIX lseek is the only way for hypervisors to efficiently create large files, but no backing storage is created with lseek; the File Space Reservation and Extended Stats features enable hypervisors to quickly reserve space on the NAS server over NFS/CIFS to create a thick-provisioned virtual disk for a VM

File Cloning (Full and Lazy)
• Goal: clone VMs in a faster and more storage-space-efficient way; can also be used for faster, storage-efficient snapshots and restores (VM snapshot)
• Use cases: VM cloning/deployment and Storage vMotion are examples of operations that are offloaded to the NAS server; lazy clones are typically used for instant operations, e.g. VMware Linked Clones and VDI; full clones are used for operations across datastores

Extended Statistics
• Goal: retrieve accurate space utilization of VMs, data that cannot be retrieved using NFS calls; monitor space utilization in a sparse file and guarantee adequate space for VMs

Representative NAS Primitives and Use Cases (illustrations from VMware NAS VAAI)
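The lseek point above is easy to demonstrate: seeking past end-of-file and writing one byte makes a file whose apparent size is large while almost no backing storage is allocated. A minimal sketch (Linux semantics; `st_blocks` behavior depends on the filesystem supporting sparse files):

```python
import os
import tempfile

# Create a 1 GiB "thick-looking" file the only way plain POSIX file
# operations allow: seek to the last byte and write it.  The file *size*
# becomes 1 GiB, but the blocks in between have no backing storage --
# exactly the gap the File Space Reservation primitive closes.
fd, path = tempfile.mkstemp()
try:
    os.lseek(fd, 1024 ** 3 - 1, os.SEEK_SET)  # jump to the last byte
    os.write(fd, b"\0")                       # writing it sets the size
    st = os.stat(path)
    print(st.st_size)          # 1073741824 -- apparent size
    print(st.st_blocks * 512)  # real allocation; typically far smaller
                               # on sparse-capable filesystems
finally:
    os.close(fd)
    os.remove(path)
```

A VM whose disk is backed by such a sparse file can hit an out-of-space error mid-write, which is why the hypervisor wants the NAS server to reserve the space up front.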
Efficiencies gained
File Space Reservation
• Thick-provisioned semantics are hard to achieve on NAS due to NFS protocol limitations
Cloning - Full and Lazy
• Doing the cloning within the NAS box is more efficient
• Furthermore, file-level writeable snapshot (i.e. clone) technology can be used to do the lazy copy without actually copying blocks
• Dedupe performance can be improved by recognizing that two extents are just copies of each other
Extended Attributes
• Understand exactly how much space a lazy clone uses
• Understand how much space a VMDK file is actually using
Additional Observations
• Delegate creation of VM snapshots to the storage box; these snapshots need to be hypervisor-consistent. For NAS, it is inefficient for hypervisors to do file snapshots.
• Flash-based storage boxes do not write in place and can do snapshots much more efficiently.
• VDI applications could use the writeable snapshots (clones) already provided by most storage vendors.
Additional Observations
• With storage-box-provided snapshots, backups can be taken without involving the server
• A primitive for disaster recovery could use replication or remote mirroring (similar to SRA/SRM)
• UNMAP even for thick provisioning on flash-based SAN arrays helps reduce the size of book-keeping data structures
• Storage arrays can demultiplex IO streams from the hypervisor and provide VM-level IO-stream QoS (vVol)
Microsoft Offloaded Data Transfer (ODX)
• Available from Windows Server 2012 onwards
• Works for both SAN storage (VHDs) and NAS using the SMB protocol
• Provides full-copy semantics for both SAN and NAS
• The protocol works as follows:
1. Send an offload read request to the source device
2. The source device returns a token
3. Send an offload write request with the token to the destination device
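The three-step token flow above can be sketched with a toy in-memory device model. This illustrates only the control flow (a small token travels through the host while the data moves inside the storage); the class and method names are hypothetical, not the actual ODX/FSCTL interfaces.

```python
import secrets

class Array:
    """Toy storage device illustrating the ODX-style token exchange."""

    def __init__(self):
        self.luns = {}    # lun name -> bytearray of data
        self.tokens = {}  # token -> (source lun, offset, length)

    def offload_read(self, lun, offset, length):
        # Steps 1-2: host requests an offload read; device returns a
        # token representing the data instead of the data itself
        token = secrets.token_hex(16)
        self.tokens[token] = (lun, offset, length)
        return token

    def offload_write(self, token, dst_lun, dst_offset):
        # Step 3: host presents the token at the destination; the device
        # moves the data internally, so it never crosses the fabric twice
        src_lun, offset, length = self.tokens.pop(token)
        data = self.luns[src_lun][offset:offset + length]
        self.luns[dst_lun][dst_offset:dst_offset + length] = data

array = Array()
array.luns["src"] = bytearray(b"virtual disk contents")
array.luns["dst"] = bytearray(21)
tok = array.offload_read("src", 0, 21)  # only the token reaches the host
array.offload_write(tok, "dst", 0)
assert array.luns["dst"] == b"virtual disk contents"
```

The host's bandwidth cost is a few hundred bytes of token traffic regardless of how many gigabytes the copy covers.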
Author Biography
Shriram Pore - Solutions Architect, Calsoft Inc.
• A veteran of the storage industry with more than 11 years of experience in architecting and developing products
• Key strength lies in quickly understanding product requirements and translating them into architectural and engineering specs for implementation
• Architected, designed, and implemented solutions for NAS-VMware integrated backup-recovery and VM cloning
• Led FS and replication teams on the file server, and led the CORE component team from a system-management perspective (configuration, provisioning, and management of various file-server facilities)
• Master of Computer Science, Pune University, India; Bachelor of Computer Science, Pune University, India
Presenter Biography
Dr. Anupam Bhide - CEO, Co-Founder, Calsoft Inc.
• Storage industry veteran with more than 21 years of industry experience
• Senior Architect in the RDBMS development group at Oracle Corp; designed some of the key features of Oracle8
• Founding member of the DB2/6000 Parallel Edition team at IBM Research Center
• Visiting Faculty at University of California, Berkeley
• Ph.D. in Computer Science, University of California, Berkeley; MS, University of Wisconsin-Madison; BS in Computer Science, Indian Institute of Technology, Bombay
Thank You
Questions & Answers
Contact info
Dr. Anupam Bhide
CEO, Co-Founder, Calsoft Inc.
Email: [email protected]
Phone: +1 (408) 834 7086
Twitter: @Calsoftinc
Integration of Hypervisors and NAS capabilities
Steps Involved
Identify
• Identify inefficient file operations on VMs in virtual environments
Define
• Define a set of NAS primitives (standard APIs) based on the above identification, including capability exchange (supported primitives); this may be added to SNIA specs for standardization
Provide
• Provide a framework to implement the APIs as a plugin on hypervisors
Implement
• NAS vendor implements the plugin, using its own technologies to achieve maximum performance
Call
• Hypervisor calls into the plugin via the defined framework APIs to leverage NAS capabilities
Modify
• Modify NFS/CIFS to accommodate the desired NAS primitives, or else provide a new interface that can easily be integrated into the plugin with no security compromise
Adopt
• NAS server adopts proven, efficient technologies for increased performance and reduced network bandwidth usage (e.g. file clone, server copy, FS snapshots, Thin Versioning, de-dupe, etc.)
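The plugin framework sketched in the steps above could take the shape of an abstract interface that each NAS vendor implements. Every name here (`NasPlugin`, `reserve_space`, `deploy_vm`, the capability strings) is a hypothetical illustration of the Provide/Implement/Call steps, not an actual hypervisor or vendor API.

```python
from abc import ABC, abstractmethod

class NasPlugin(ABC):
    """Hypothetical vendor plugin interface: the hypervisor calls these
    methods, and the vendor implements them over its own RPC mechanism."""

    @abstractmethod
    def capabilities(self) -> set:
        """Capability exchange: report which primitives are supported."""

    @abstractmethod
    def reserve_space(self, path: str, size: int) -> None:
        """File Space Reservation: back `path` with `size` bytes."""

    @abstractmethod
    def clone_file(self, src: str, dst: str, lazy: bool = True) -> None:
        """File Cloning: full or lazy clone done on the NAS server."""

    @abstractmethod
    def extended_stats(self, path: str) -> dict:
        """Extended Statistics: real allocation, not just apparent size."""

def deploy_vm(plugin: NasPlugin, template: str, vm_disk: str, size: int):
    # The hypervisor offloads where a primitive exists and falls back to
    # slow host-side paths where it does not
    if "clone" in plugin.capabilities():
        plugin.clone_file(template, vm_disk)  # offloaded, near-instant
    else:
        ...                                   # host-side read/write copy
    if "reserve" in plugin.capabilities():
        plugin.reserve_space(vm_disk, size)   # thick-provision guarantee
```

The capability-exchange step is what lets one hypervisor binary work against NAS boxes with different feature sets, which is the core argument for standardizing the interface.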