© 2010 IBM Corporation
Session xVI05 - Open Source Virtualization with KVM for IBM System x
Tom Schwaller (Tom.Schwaller@de.ibm.com) - Linux Architect
IBM Systems Technical University - Budapest, 3-6 May 2010
IBM Systems Technical University - Budapest, 3-6 May 2010 - xVI06 - Open Source Virtualization with KVM on IBM System x
Tom Schwaller (tom.schwaller@de.ibm.com), Linux IT Architect, IBM Germany
Tom Schwaller studied Mathematics and Theoretical Physics at the Swiss Federal Institute of Technology in Zurich and afterwards worked on several high-performance computing research projects. From 1996-2001 he was editor-in-chief of the German Linux Magazine and also co-founded Linux New Media AG.
Since June 2001 he has worked as a Linux IT Architect at IBM Germany and, as a member of the Linux Impact Team (until the end of 2005), helped many IBM customers with their Linux migrations. As IBM's Linux Evangelist he also supported hundreds of Linux customer briefings, gave dozens of radio, TV and press interviews, represented IBM as a keynote speaker at all major German Linux events (LinuxTag, LinuxWorld Expo, LinuxPark, etc.) and served as advisory board member / chairman of several conferences (SambaXP, iX Eclipse Conference, First International Virtualization Conference, etc.). As EMEA Linux Desktop Technical Leader he also co-authored the very successful IBM Linux Client Migration Cookbook.
Since 2006 his main focus has been on high-performance and cloud computing, virtualization, high-end x86 systems, iDataPlex & BladeCenter, and high-speed InfiniBand/10GbE networking/storage (incl. GPFS). From 2007-2008 he was Deep Computing Lead Architect in CEEMEA. At the moment he works as Lead Architect for a major Linux Desktop Cloud project in Germany.
© 2010 IBM Corporation
IBM Systems Technical University - Budapest, 3-6 May 2010 - xVI06 - Open Source Virtualization with KVM on IBM System x
Agenda
 x86 Virtualization
 KVM (Kernel-based Virtual Machine)
 KVM Performance Tuning
 – KSM (Kernel SamePage Merging)
 – VirtFS
 KVM I/O Architecture (Evolution)
 – Virtio
 – I/O Acceleration with vhost-net & SR-IOV
 KVM Security
 – SELinux & sVirt
 QEMU
 – Creating Disk Images & Manual Installation
Thanks to Anthony Liguori, Hollis Blanchard, Ram Pai
x86-Virtualization
Some Virtualization Use Cases
Datacenter Consolidation
– Primary driver: increase utilization (primarily CPU, memory)
– Special case: security isolation (e.g. hosting providers)
Hardware Abstraction
– Driver: new hardware with an older guest OS
– Support via virtual device drivers, e.g. an ATA disk
Leverage New Technologies (FCoE, 10 GbE)
Cloud Computing
– Resources on demand, pay per use
Development and Testing
– Driver: multiple development and testing environments, isolation from the main workspace on the host, fault injection
Virtual Desktop
– Multiple OSes
– Legacy applications (DOS, Windows)
Thin Clients
– Provisioning, management, high availability
– Accessible from everywhere
Evolution of x86 Hypervisors

(Diagram: three architectures side by side)
– First generation, software-based: a hypervisor on x86 using binary translation, with a privileged Domain 0 alongside the guest VMs.
– Second generation, para-virtualization: guests run a PV kernel with PV drivers on top of the hypervisor's virtualization logic.
– Third generation, hardware-based / OS virtualization: Linux with KVM runs guest VMs on VT-capable x86 hardware.
KVM (Kernel-based Virtual Machine)
An Alternative to Xen
In 2006-2007, kernel developers started talking about an alternative to Xen that was more closely aligned with Linux
– A few issues stood out:
 • NUMA support
 • Control tool stack
A startup, Qumranet, wanted to build a VDI solution using open virtualization
Qumranet never intended to be a hypervisor vendor
Leveraging Hardware
Design virtualization support around hardware virtualization
– Hardware virtualization support is pervasive
– Modern Intel VT-x/AMD-V outperforms paravirtualization (PV)
Tremendous simplification comes from not supporting older hardware
– Intrusive Linux patching is unnecessary
Leveraging Linux
Xen is virtualization added to an exokernel
Linux is proof-by-example that monolithic kernels are more scalable/secure/fast/stable than microkernels/exokernels
– Linux dominates the top 100
– Linux has a large share in the embedded space
– Rising desktop/server market shares
If Linux can be used in naval destroyers for systems control, why can't it be used to run a couple dozen Windows XP instances running Office?
KVM Design Philosophy
A guest is a userspace process
– Not just from a management perspective
From a performance perspective, switching to and from the guest is roughly equivalent to switching between kernel and userspace
The problems that need to be solved for virtualization are roughly the same as for doing the equivalent in userspace
– PCI passthrough == userspace PCI drivers
– Unsurprisingly, interrupt sharing is the major issue for both
If we solve the problems for Linux userspace in general, we solve the problems for virtualization
Kernel-based Virtual Machine (KVM) - Overview
Converts Linux into a Type-1 hypervisor
Incorporated into the Linux kernel in 2006
Runs Windows, Linux and other guests
KVM architecture leverages the power of Linux
– Built on a trusted, stable, enterprise-grade platform
– Scheduler, memory management, hardware support, etc.
– Ease of management - same Linux paradigm
Advanced features
– Inherits scalability, NUMA support, power management and hot-plug from Linux
 • Red Hat's hypervisor (KVM) is expected to support >96 cores / 1 TB RAM on the host and 16 vCPUs / 64 GB RAM per guest!
– SELinux security, real-time scheduler, RAS support, OpenGL for guests
– Live migration of virtual machines
– VM storage access (iSCSI, AoE, FCoE, GNBD, cLVM, ...) from Linux
Hybrid-mode operation
– Run regular Linux applications side by side with virtual machines on the same server - a much higher degree of hardware efficiency
http://www.linux-kvm.org
http://www.linux-kvm.com
Linux KVM - Architecture
Type-1 hypervisor
– Not "bare metal" in the classical sense, but the hypervisor is kernel-integrated
Introduces a new instruction execution mode - Guest Mode
– Executes VMs closer to the kernel, avoiding the user-mode context switching of a traditional, non-kernel-integrated Type-2 hypervisor
A slightly modified QEMU is used for the HVM construct and I/O
– virtio utilizes the user-mode virtio drivers inherent in the kernel/QEMU for performance
KVM Execution Model

(Diagram) Userspace enters the kernel via ioctl(), which switches to Guest Mode for native guest execution. A lightweight exit is handled by the kernel exit handler and re-enters the guest directly; a heavyweight exit is forwarded to the userspace exit handler before the guest is resumed.
KVM is a Virtualization Driver
KVM is a small kernel driver that adds virtualization support on multiple architectures
– AMD, Intel (included in 2.6.20)
 • KVM-lite: PV Linux guest on non-VT-x / non-SVM hosts
– IA64 (included in 2.6.26)
– S390 (included in 2.6.26)
– Embedded PowerPC (power.org, included in 2.6.26)
About 30k LOC, compared to ~250k LOC for Xen
Uses QEMU in userspace as a device model
Safe to use by unprivileged userspace processes
Can leverage almost all Linux features
KVM Features
Power Management
– C and P state support
– Advanced governors
– Suspend/resume
Memory Management
– NUMA support
 • Policy control
 • Memory migration
– Swapping
– Overcommit
– Compression (KSM)
Resource Control
– cgroups
– CFS tunables
Anything that Linux supports
All hardware that Linux supports is supported in KVM
– Compare this to ESX!
KVM Upstream Maintainerships

 KVM modules (~15 kloc): Avi Kivity, Red Hat
 QEMU device model (~120 kloc): Anthony Liguori, IBM
 Libvirt toolchain (~100 kloc): Daniel Berrange, Red Hat
 Libvirt-CIM providers (~25 kloc): Kaitlin Rupert, IBM
RHEL KVM Roadmap

RHEL 5.5 (03/2010) - Functions
– Nehalem-EX performance optimizations
– Device model improvements
– More flexible interrupt handling
– PCI pass-through
– Bug fixes (vis-a-vis 5.4)

RHEL 6.0 (2Q/2010) - Functions
– Improved I/O throughput
– SR-IOV support
– Virtual switch enhancements
– Improved RAS via Intel MCA
– Storage and network management APIs

RHEL 6.x (tbd) - Functions
– LPAR Mode Aptus support
– HA and node failover
– Clustered file system
– UEFI guest BIOS
– Westmere performance optimizations
KSM (Kernel SamePage Merging)
KSM - Memory Page Sharing
Implemented as a loadable kernel module
– Kernel SamePage Merging (KSM), included in Linux kernel 2.6.32 (Izik Eidus)
– modprobe ksm
– cat /sys/kernel/mm/ksm/max_kernel_pages (-> 2000)
– cat /sys/kernel/mm/ksm/pages_sharing (> 0)
The kernel scans the memory of virtual machines
– Looks for identical pages
– Merges identical pages
– Stores only one (read-only) copy of the shared memory
– If a guest changes a page, it gets its own private copy
qemu-kvm KSM patch added to the KVM development tree after the kvm-88 release
Significant hardware savings
– Better consolidation ratio
– Allows more virtual machines to run per host
 • Memory overcommit (avoiding Linux swapping)
 • 600 (web) VMs on a host with 48 cores and 256 GB RAM! (Red Hat claim)
http://www.linux-kvm.com/content/using-ksm-kernel-samepage-merging-kvm
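The sysfs knobs above can be combined into a short host-side sketch (paths follow the kernel's KSM documentation; running it requires root and a KSM-enabled kernel, and max_kernel_pages was dropped in later kernels):

```shell
#!/bin/sh
# Load the KSM module (built into many kernels) and start the scanner.
modprobe ksm 2>/dev/null
echo 1 > /sys/kernel/mm/ksm/run

# Watch how much guest memory is being merged:
#   pages_shared  - number of de-duplicated pages kept in memory
#   pages_sharing - number of page-table entries pointing at them
for f in pages_shared pages_sharing; do
    echo "$f: $(cat /sys/kernel/mm/ksm/$f)"
done
```

A pages_sharing/pages_shared ratio well above 1 indicates the scanner is finding identical pages across guests.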
IBM Linux Technology Center - KSM Recommendations
Include the virtio balloon driver and auto-ballooning daemon in RHEL 5.x guests. Tests show that ballooning is necessary for KVM over-commitment to be effective for Linux guests (1.7x).
Substitute a more recent 2.6.3x kernel for the default RHEL 5.x 2.6.18 kernel. These later kernels are significantly better at handling low-memory situations than the default kernel.
Decrease the frequency or randomize the runtime of periodic daemons. For example, the yum-updatesd daemon loads by default 1 hour after boot. When this daemon loads concurrently on over-committed guests, all the guests fault in the Python runtime, causing a concurrent resource spike. Randomizing the start time avoids this spike in memory usage.
Setting the swappiness tunable in the KVM host to zero significantly improves performance by causing the host to prefer evicting page cache pages before guest pages.
Turn zone reclaim off. Initial testing showed that memory pressure caused by over-commitment can trigger some unexpected performance reductions when specific NUMA memory zones become fully allocated.
Turn NUMA off. Provisionally we see better results with NUMA turned off. We suspect there are some latent bugs in the NUMA allocation that we should find and fix.
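The last three host-side recommendations translate into a couple of sysctl settings; a sketch, to be run as root on the KVM host (note that disabling NUMA is a kernel boot parameter, not a sysctl):

```shell
#!/bin/sh
# Prefer evicting host page-cache pages before swapping out guest pages.
sysctl -w vm.swappiness=0

# Disable zone reclaim so allocations spill over to other NUMA zones
# instead of stalling on local reclaim under memory over-commitment.
sysctl -w vm.zone_reclaim_mode=0

# NUMA itself is disabled at boot time, e.g. by appending "numa=off"
# to the kernel command line in the bootloader configuration.
```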
VirtFS
VirtFS - Overview
What is it?
– A filesystem pass-through mechanism between the KVM host and guest operating systems (a para-virtualized file system)
– Uses the Plan 9 protocol (9P2000.L) between client and server
 • Simple, efficient protocol, maintained by IBM
– The server runs on the host as part of QEMU, with a virtio transport
– The client is part of the guest kernel
What are the expectations?
– Provide secure and isolated filesystem exports between the KVM host and guest
– Close-to-native filesystem (GPFS) performance
– Multi-tenancy
Who needs it?
– VSC (Virtual Storage Cloud)
 • SoNAS on top of a VirtFS client in a KVM guest
– SoNAS
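On the guest side, a VirtFS export is consumed as a 9p mount over the virtio transport; a hedged sketch, where the mount tag `hostshare` is a hypothetical name assigned when the export was defined on the QEMU side, and option names follow the kernel's 9p client documentation:

```shell
# Inside the guest: load the 9p-over-virtio transport and mount the
# export identified by its mount tag ("hostshare" is a placeholder).
modprobe 9pnet_virtio
mkdir -p /mnt/host
mount -t 9p -o trans=virtio,version=9p2000.L hostshare /mnt/host
```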
VirtFS - Block Diagram

(Diagram) Apps on the guest go through the guest kernel's VFS interface to the VirtFS (v9fs) client, across the virtio ring to the VirtFS server (the v9fs server in QEMU, in host user space), which uses the GPFS API and the host kernel's VFS interface / GPFS client to reach the hardware.
VirtFS and GPFS
Why not use GPFS directly in the guest?
– GPFS limits the number of cluster cross-mounts, limiting the number of virtual machines.
– GPFS in the guests would consume significant memory due to its fixed i-node cache allocation.
– GPFS is much more restrictive about supported kernel versions and operating systems.
– Disk management becomes difficult in a dynamically changing environment.
Virtio
Virtio
First proposed by Rusty Russell
– Based on our experiences with the Xen frontend/backend architecture
Addressed a number of concerns:
– Clear separation between protocol and transport, so multiple hypervisors can use it
– Each component uses a well-defined interface and is replaceable
– Minimal driver implementation required
– Fits well on top of existing hardware abstractions (PCI)
Linux will support lguest, KVM, Xen, KVM-lite, PHYP, VMware, Viridian, and possibly more
– If each has 4-5 PV drivers, that's 35 new drivers!
– All those drivers would be doing the same thing
virtio is an abstraction of the common mechanism of VMMs
– A single driver can, with little modification, run on many different VMMs
Especially important for "small" drivers (entropy driver, CPU hotplug, ballooning, etc.)
Virtio Architecture

(Diagram) The device drivers (virtio-net, virtio-blk, virtio-video, virtio-9p) sit on the common virtio layer, which is backed by a hypervisor-specific backend (lguest, Xen or KVM).
Virtio for KVM
Paravirtualized drivers for KVM/Linux
– virtio was chosen as the main platform for I/O virtualization in KVM
– The idea behind it is a common I/O virtualization framework for hypervisors (the same as in Xen)
– Network/block/balloon/PCI passthrough devices are supported for KVM
– The host implementation lives in userspace (qemu), so no driver is needed in the host (but it still has some performance issues)
Hardware-assisted virtualization
– Support for advanced hardware features for both KVM and Xen
 • VT-d for secure PCI pass-through on Intel platforms
 • IOMMU for secure PCI pass-through on AMD platforms
 • PCI Single-Root I/O Virtualization (SR-IOV)
– Delivers native I/O performance for network and block devices
Support for Microsoft Windows Server guests
– Paravirtualized drivers for network and disk (WHQL certified -> enterprise distros)
– Microsoft SVVP certification (-> enterprise distros)
Current State of Virtio
Drivers:
− virtio-net
− virtio-blk
− virtio-console
− virtio-balloon
− virtio-random
Transports:
− virtio-pci
− virtio-s390
− virtio-lguest
Virtio-net
virtio-net
Where most of the work is happening these days
Current performance is, in most cases, better than Xen netfront/netback
− Xen suffers from asymmetric RX/TX performance
− KVM maintains symmetry on both
The mainline bits are still only roughly 50% of native
− Active work is underway to improve that further
Uses the tun/tap device
− GSO support was added to improve performance
KVM netperf with virtio-net (benchmark chart, © 2009 Chris Wright)
KVM netperf with Device Assignment (benchmark chart, © 2009 Chris Wright)
I/O Acceleration withvhost-net & SRIOV
Thanks to Ram Pai (IBM)
Virtualization Phase I - QEMU Emulation

(Diagram: guest OS with virtual NIC and virtual disk inside the QEMU process; NIC and disk drivers in the host OS)

Emulation entirely done in user space
QEMU-emulated hardware
– e1000/RTL NIC
– IDE driver
No hardware support
No host OS support
QEMU/guest is just a user process for the host OS
Performance - VERY SLOW
Virtualization Phase II - KVM Acceleration

(Diagram: guest OS with virtual NIC and virtual disk inside QEMU; KVM module plus NIC and disk drivers in the host OS)

QEMU emulates devices and runs in user mode
The guest is still part of the QEMU process
The guest image runs in Guest Mode, facilitated by KVM
KVM exploits Intel VT / AMD-V CPU support
Performance
– Guest CPU speed is near native
– I/O is slow
 • Guest->Host->User mode (and vice versa) context-switch penalty for each I/O operation
Virtualization Phase III - I/O Acceleration through Virtio

(Diagram: guest OS with virtio driver and send/receive queues inside QEMU; KVM, NIC and disk drivers in the host OS)

QEMU-emulated virtio device
Guest runs a paravirt virtio driver
I/O is buffered in circular send and receive queues
Context switches from Guest->Host->User (and vice versa) are reduced significantly
Performance
– Guest CPU speed is near native
– Better I/O throughput and lower latency at lower CPU utilization
Virtualization Phase IV - I/O Acceleration through vhost-net

(Diagram: guest OS with virtio driver and send/receive queues inside QEMU; vhost-net, KVM, NIC and disk drivers in the host OS)

Kernel-emulated virtio device through vhost-net
Guest runs a paravirt virtio driver
I/O is buffered in circular send and receive queues
The kernel-to-user context switch (and vice versa) per vmexit is eliminated
Performance
– Guest CPU speed is near native
– Further lowered latencies
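With a vhost-capable host kernel and QEMU, the in-kernel backend is selected per tap device; a hedged sketch (the option syntax follows contemporary qemu-kvm builds and may differ by version; `tap0` and the disk path are placeholders):

```shell
# Attach a virtio NIC whose datapath runs in the host kernel
# (vhost-net) instead of in the QEMU process.
qemu-kvm -m 1024 -drive file=/path/to/guest.img \
    -netdev tap,id=hn0,ifname=tap0,script=no,vhost=on \
    -device virtio-net-pci,netdev=hn0
```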
SR-IOV Overview
SR-IOV: Single Root I/O Virtualization, a PCI-SIG standard
The ability to drive a PCIe function from multiple independent software entities
Each software entity believes it has exclusive access
Provides high throughput, low CPU utilization and high scalability
Requires platform support

(Diagram: a PCIe port exposing physical functions PF0-PF2, each with its own configuration resources and virtual functions VF0..VFn)

Replicates each hardware physical function into multiple virtual functions
– PF -> Physical Function
– VF -> Virtual Function
A Virtual Function is a replica of the Physical Function
Each Physical Function can have up to 256 Virtual Functions
Each Virtual Function can be driven by an independent software entity
Virtual Functions are lightweight and lack their own configuration resources
Virtualization Phase V - SR-IOV Hardware Acceleration

(Diagram: the host OS runs the PF driver against the PF; each guest OS1..OSn runs a VF driver directly against its VF1..VFn, with KVM in between)

The host configures and enables the PF and all VFs
Host and QEMU are bypassed in the I/O path
Each guest controls a VF
Side-band communication path between PF and VF
– For communicating device information
– For coordination between PF and VF on device reset
Native CPU speed
Promises native I/O performance at negligible CPU overhead
Issues with SR-IOV
All guest pages have to be pinned
– Cannot overcommit guest memory
Requires PCI pass-through platform support
Guest migration across hosts is challenging
TCP Bandwidth Comparison (benchmark chart)
SR-IOV provides higher bandwidth at lower CPU utilization
 • 3400 M2 Nehalem, 16 CPUs, 4 GB memory
 • Intel 1G SR-IOV adapter
 • RHEL 5u4 host running a RHEL 5u4 guest
TCP Latency Comparison (benchmark chart)
SR-IOV surprisingly has higher latency
 • 3400 M2 Nehalem, 16 CPUs, 4 GB memory
 • Intel 1G SR-IOV adapter
 • RHEL 5u4 host running a RHEL 5u4 guest
How to use SR-IOV
Ensure SR-IOV is supported and enabled in the BIOS/UEFI
If the kernel is not enabled for it, use the command-line workaround: pci=assign-busses
Activate the driver to enable a VF:
 # modprobe igb max_vfs=1
Check for the existence of the VF with lspci:
 07:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
 08:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
Enable the VF for PCI pass-through:
 # pciid=$(lspci -n | grep $bus | awk '{print $3}' | sed -e 's/:/ /')
 # echo -n $pciid > /sys/bus/pci/drivers/pci-stub/new_id
 # echo -n 0000:$bus > /sys/bus/pci/devices/0000:$bus/driver/unbind
 # echo -n 0000:$bus > /sys/bus/pci/drivers/pci-stub/bind
Pass the VF PCI ID to the guest:
 # qemu-kvm -pcidevice host=$bus ...
Verify that the device is grabbed by the corresponding driver in the guest:
 # ethtool -i eth0
 driver: igbvf
QEMU
QEMU
QEMU is a community-driven project
– No company has sponsored major portions of its development
QEMU does a really amazing thing
– Can emulate 9 target architectures on 13 host architectures!
– Provides full system emulation supporting ~200 distinct devices
– Very sophisticated and complete command-line interface (CLI)
– There are more than 90 options in the output of qemu-kvm --help
Is the basis of KVM, Xen HVM and xVM VirtualBox
– Every open source virtualization project uses QEMU
Userspace device model for KVM
− Provides a management interface
− Provides device emulation
− Provides paravirtual I/O backends
libvirt communicates directly with QEMU; libvirt-cim communicates with libvirt
Creating a Virtual Disk with qemu-img
# qemu-img create
– Options: format; filename; size; compression; encryption; base image
– Formats: qcow/qcow2, VMware, Virtual PC, Parallels, raw and 8 more
– A base image is an existing virtual disk used as the initial state for a copy-on-write snapshot
 • There is no coordination, so you should manually make it read-only
– Example:
 # qemu-img create -f qcow2 /path/to/virtualdisk 6G
# qemu-img info /path/to/virtualdisk
– Gives information about the virtual disk, e.g. size (on-disk and virtual), format, snapshots
# qemu-img convert -f <fmt> /path/to/virtualdisk -O <fmt> /path/to/converteddisk
– Converts the virtual disk format, e.g.:
 • Virtual PC to qcow
 • Add compression or encryption
# qemu-img commit /path/to/virtualdisk
– Writes a virtual disk's changes to its base image
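The base-image workflow described above looks like this end to end (file names are illustrative):

```shell
# Create a 6 GB base disk, then a copy-on-write overlay on top of it.
qemu-img create -f qcow2 base.qcow2 6G
qemu-img create -f qcow2 -b base.qcow2 overlay.qcow2

# Nothing stops writes to the base image, so protect it manually.
chmod a-w base.qcow2

# Inspect the overlay, then fold its changes back into the base.
qemu-img info overlay.qcow2
chmod u+w base.qcow2 && qemu-img commit overlay.qcow2
```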
Windows 2003 Installation on Fedora 12
Install the KVM packages:
 # yum install qemu kvm
Create an image file for the virtual hard disk:
 # qemu-img create -f qcow2 win2003-1.img 4G
Start the virtual machine:
 # qemu-kvm -no-acpi -m 384 -hda win2003-1.img \
   -cdrom w2k3.iso -boot d -smp 2
 -m = memory (in MB)
 -hda = first hard drive (many image file types supported)
 -cdrom = ISO image or CD/DVD drive
 -boot [a|c|d] = boot from floppy (a), hard disk (c) or CD-ROM (d)
 -smp = number of CPUs
Install Windows, download the PV drivers into the VM, then restart with the virtio network option:
 # qemu-kvm -no-acpi -m 384 -boot c -smp 2 win2003-1.img \
   -net nic,model=virtio
Install the paravirtualized (PV) network drivers and reboot
For the paravirtualized Windows block device (driver installation and usage) see e.g.
 http://www.linux-kvm.com/content/redhat-54-windows-virtio-drivers-part-2-block-drivers
Result of Windows 2003 Installation
OpenSolaris 2008.11 on RHEL-5.3/5.4 (KVM)
KVM Example with More Networking Options
 # qemu-kvm -hda /path/to/virtualdisk \
   -net nic,model=e1000,macaddr=ac:de:48:64:61:1c \
   -net tap,script=no,ifname=kvmtap679a \
   -m 512 -smp 2 -usb -localtime -name "MyVM"
Networking options
– User-space stack [-net user,vlan=<n>]
 • Port forwarding [-redir tcp|udp:host-port:guestIP:guest-port]
– Tap a host interface [-net tap,vlan=<n>,ifname=<tapname>,script=no]
– Socket (private shared network with another host)
 [-net socket,vlan=<n>,listen=:<port>]
 [-net socket,vlan=<n>,connect=<host>:<port>]
– Multicast socket (shared network with multiple hosts)
 [-net socket,vlan=<n>,mcast=<addr>:<port>]
– VM with no NICs [-net none]
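A combined example using the user-space stack with a port redirect, so an ssh connection to host port 2222 reaches the guest's port 22 (the disk path is a placeholder):

```shell
# User-mode networking needs no root privileges; -redir forwards
# host TCP port 2222 to port 22 of the guest on the user stack.
qemu-kvm -hda /path/to/virtualdisk -m 512 \
    -net nic,model=virtio -net user \
    -redir tcp:2222::22
```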
KVM Security
sVirt: Hardening Linux Virtualization with Mandatory Access Control
sVirt: a pluggable security design for libvirt
– Supports MAC security schemes like SELinux and SMACK
MAC policy is enforced by the host kernel
Guests and resources are uniquely labeled: svirt_t, virt_image_t, virt_content_t, ...
Coarse rules for all isolated guests are applied to svirt_t
For simple isolation: all accesses between different UUIDs are denied
Current status
– Low-level libvirt integration done
– Can launch labeled guests
– Basic label support in virsh
Future enhancements
– Different types of isolated guests: svirt_web_t
– Virtual network security
– Controlled flow between guests
– Distributed guest security
Related work
– Labeled NFS
– Labeled networking
– XACE
Similar work
– XSM (a port of Flask to Xen)
http://selinuxproject.org/page/SVirt
Xen vulnerability: http://www.hacker-soft.net/Soft/Soft_13289.htm
© 2009 James Morris
sVirt Dynamic Labeling
– Generates a random unused MCS (Multi-Category Security) label
– Labels the image file/device: svirt_image_t:MCS1
– Launches the image: svirt_t:MCS1
– Labels read-only content: virt_content_t:s0
– Labels shared read/write content: svirt_t:s0
– Labels the image on completion: virt_image_t:s0
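The effect of dynamic labeling can be checked from the host with standard SELinux-aware tools; a sketch (the exact label values and image path vary per system and guest):

```shell
# Show the per-guest svirt_t:MCS category on each running qemu process ...
ps -eZ | grep qemu

# ... and the matching svirt_image_t label on the guest disk images.
ls -Z /var/lib/libvirt/images/
```

Two guests launched by libvirt should show different MCS categories, which is what enforces their mutual isolation.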
sVirt Static Labeling (Multi-Level Security)
– The administrator must specify the image label, e.g. svirt_t:TopSecret
– Launches the image: svirt_t:TopSecret
– libvirt will NOT label any content
– The administrator is responsible for labeling content
virt-manager with static SELinux labeling
Questions ?
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: AS/400, DBE, e-business logo, ESCO, eServer, FICON, IBM, IBM Logo, iSeries, MVS, OS/390, pSeries, RS/6000, S/30, VM/ESA, VSE/ESA, Websphere, xSeries, z/OS, zSeries, z/VM
The following are trademarks or registered trademarks of other companies
Lotus, Notes, and Domino are trademarks or registered trademarks of Lotus Development Corporation
Java and all Java-related trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States and other countries
LINUX is a registered trademark of Linus Torvalds
UNIX is a registered trademark of The Open Group in the United States and other countries.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.
Intel is a registered trademark of Intel Corporation
* All other products may be trademarks or registered trademarks of their respective companies.
NOTES:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
References in this document to IBM products or services do not imply that IBM intends to make them available in every country.
Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use.
The information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.