OSCON 2012: From the Datacenter to the Cloud - Featuring Xen and XCP
OSCON: From the Datacenter to the Cloud
Patrick F. Wilbur, PFW Research LLC
Josh West, One.com
Steve Maresca, Zentific LLC
George Dunlap, Xen.org
Featuring Xen and XCP
Schedule
● Unit 1: 09:00 - 09:45  Introducing Xen and XCP
● Unit 2: 09:50 - 10:45  DevOps
● Break:  10:45 - 11:00
● Unit 3: 11:00 - 11:55  XCP in the Enterprise
● Unit 4: 12:00 - 12:30  Future of Xen
Unit 1: Introducing Xen and XCP
Unit 1 Overview
● Introduction & Xen vs. Xen Cloud Platform
● Xen/XCP Installation & Configuration
● XCP Concepts: pools, hosts, storage, networks, VMs
Introduction & Xen vs. Xen Cloud Platform
Xen, XCP, Project Kronos
Types of Virtualization
● Emulation: fully emulate the underlying hardware architecture
● Full virtualization: simulate the base hardware architecture
● Paravirtualization: abstract the base architecture
● OS-level virtualization: shared kernel (and architecture), separate user spaces
Types of Virtualization
● Emulation: fully emulate the underlying hardware architecture
● Full virtualization - Xen does this! Simulate the base hardware architecture
● Paravirtualization - Xen does this! Abstract the base architecture
● OS-level virtualization: shared kernel (and architecture), separate user spaces
What is Xen?
● Xen is a virtualization system supporting both paravirtualization and hardware-assisted full virtualization
● Initially created by the University of Cambridge Computer Laboratory
● Open source (licensed under the GPL)
What is Xen Cloud Platform (XCP)?
● Xen Cloud Platform (XCP) is a turnkey virtualization solution that provides out-of-the-box virtualization/cloud computing
● XCP includes:
○ Open-source Xen hypervisor
○ Enterprise-level XenAPI (XAPI) management tool stack
○ Support for Open vSwitch (open-source, standards-compliant virtual switch)
What is Project Kronos?
● Port of XCP's XenAPI toolstack to a Debian & Ubuntu dom0
● Gives users the ability to install Debian or Ubuntu, then apt-get install xcp-xapi
● Provides Xen users with the option of using the same API and toolstack that XCP and XenServer provide
● Early adopters can try new changes to XenAPI before they get released in mainstream XCP & XenServer versions
Case for Virtualization
● Enterprise:
○ Rapid provisioning, recovery
○ Portability across pools of resources
○ Reduced physical resource usage = reduced costs
● Small business:
○ Rapid provisioning, recovery
○ Virtual resources replace a lack of physical resources to begin with!
Who Uses Xen?
● Debian Popularity Contest:
○ 3x more people have Xen vs. KVM installed
○ 3x more people have used Xen in the last 30 days compared to KVM
○ 19% of Debian users have Xen installed & 9% used it in the last 30 days - how many Debian users exist?
● ~12% of Ubuntu Server users use Xen as a host
● Millions of users from a source that can't be named
... How many total users do you guess?
Who Uses Xen?
Believed to be at least 10-12 MILLION open-source Xen users!
(According to conservative assumptions about big distros and information we know)

Of course:
● The overall number of Xen hosts must be much higher - 1/2 million Xen hosts at Amazon alone
● The number is likely to be much higher still considering commercial products & Xen clones (client virt., EmbeddedXen, etc.)
Xen, XCP, and Various Toolstack Users
Who Uses Xen?
Some sources for reference:
● http://popcon.debian.org
● http://www.zdnet.com/blog/open-source/amazon-ec2-cloud-is-made-up-of-almost-half-a-million-linux-servers/10620
● http://www.gartner.com/technology/reprints.do?id=1-1AVRXJO&ct=120612&st=sb
Type 2 versus Type 1 Hypervisor
[Diagram: a Type 2 hypervisor runs as an application on a host OS on the PC, with guest OSes on top]
Type 2 versus Type 1 Hypervisor
[Diagram: a Type 1 hypervisor (Xen) runs directly on the PC hardware and hosts the guest OSes, versus a Type 2 hypervisor sitting on top of a host OS]
Security in Xen
● True Type 1 hypervisor:
○ Reduced-size trusted computing base (TCB)
○ Versatile Dom0 (Linux, BSD, Solaris all possible)
○ Dom0 disaggregation (storage domains, stub domains, restartable management domain)
○ Inherent separation between VMs & system resources
● Best mix of security, isolation, performance, and scalability
The Case for Xen
● Xen is mature
● Open source (even XenAPI)
● XenAPI is better than libvirt, especially for enterprise use*
* Detailed by Ewan Mellor: http://wiki.openstack.org/XenAPI
The Case for Xen
● Proven enterprise use (Citrix XenServer, Oracle VM, etc.)
● Hypervisor of choice for the cloud (Amazon, Rackspace, Linode, Google, etc.)
● Hypervisor of choice for the client (XenClient, Virtual Computer's NxTop, Qubes OS, etc.)
So, Why Xen?
● Open source
● Proven to be versatile
● Amazing community
● Great momentum in various directions
Xen Definitions
● Xen provides a virtual machine monitor (or hypervisor), which a physical machine runs to manage virtual machines
● One or more virtual machines (or domains) run on top of the hypervisor
● The management virtual machine (called Domain0 or dom0) interacts with the hypervisor & runs device drivers
● Other virtual machines are called guests (guest domains)
Virtualization in Xen
Paravirtualization:
● Uses a modified Linux kernel
● Front-end and back-end virtual device model
● Cannot run Windows
● Guest "knows" it's a VM and cooperates with the hypervisor

Hardware-assisted full virtualization (HVM):
● Uses the same, normal OS kernel
● Guest contains GRUB and a kernel
● Normal device drivers
● Can run Windows
● Guest doesn't "know" it's a VM, so the hardware manages it
Virtualization in Xen
Paravirtualization:
● High performance (claim to fame)
● High scalability
● Runs a modified operating system

Hardware-assisted full virtualization (HVM):
● "Co-evolution" of hardware & software on the x86 architecture
● Uses an unmodified operating system
Xen: Hypervisor Role
● Thin, privileged abstraction layer between the hardware and operating systems
● Defines the virtual machine that guest domains see instead of physical hardware:
○ Grants portions of physical resources to each guest
○ Exports simplified devices to guests
○ Enforces isolation among guests
Xen: Domain0 (dom0) Role
● Creates and manages guest VMs
○ xl (Xen management tool): a client application to send commands to Xen; replaces xm
● Supplies device and I/O services:
○ Runs (backend) device drivers
○ Provides domain storage
Normal Linux Boot Process
[Diagram: BIOS → Master Boot Record (MBR) → GRUB → Linux kernel and kernel modules]
The Xen Boot Process
[Diagram: GRUB starts → Xen hypervisor starts → Domain0 (kernel + modules) starts → a guest domain is started via the xl command → the guest OS boots]
Guest Relocation (Migration) in Xen
● Cold Relocation
● Warm Migration
● Live Migration
Cold Relocation
Motivation: Moving a guest between hosts without shared storage, or with different architectures or hypervisor versions
Process:
1. Shut down a guest on the source host
2. Move the guest from one Domain0's file system to another's by manually copying the guest's disk image and configuration files
3. Start the guest on the destination host
Cold Relocation
Benefits:
● Hardware maintenance with less downtime
● Shared storage not required
● Domain0s can be different
● Multiple copies and duplications

Limitations:
● More manual process
● Service will be down during the copy
Warm Migration
Motivation: Move a guest between hosts when uptime is not critical

Process:
1. Pause the guest's execution
2. Transfer the guest's state across the network to a new host
3. Resume the guest's execution on the destination host
Warm Migration
Benefits:
● Guest and processes remain running
● Less data transfer than live migration

Limitations:
● For a short time, the guest is not externally accessible
● Requires shared storage
● Network connections to and from the guest are interrupted and will probably time out
Live Migration
Motivation: Load balancing, hardware maintenance, and power management

Process:
1. Begin transferring the guest's state to the new host
2. Repeatedly copy dirtied guest memory (due to continued execution) until complete
3. Re-route network connections; the guest continues executing with execution and network uninterrupted
Live Migration
Benefits:
● No downtime
● Network connections to and from the guest remain active and uninterrupted
● Guest and its services remain available

Limitations:
● Requires shared storage
● Hosts must be on the same layer 2 network
● Sufficient spare resources needed on the target machine
● Hosts must be configured similarly
What's New in Xen 4.0+?
● Better performance and scalability
● blktap2 for virtual hard drive image support (snapshots, cloning)
● Improved IOMMU PCI passthrough
● VGA primary graphics card GPU passthrough for HVM guests
● Memory page sharing (copy-on-write) between VMs
● Online resize of guest disks
What's New in Xen 4.0+?
● Remus fault tolerance (live VM synchronization)
● Physical CPU/memory hotplug
● libxenlight (libxl) replaces xend
● PV-USB passthrough
● WHQL-certified Windows PV drivers (included in XCP)
What's New in XCP 1.5?
● Internal improvements (Xen 4.1, smaller dom0)
● GPU pass through (for VMs serving high end graphics)
● Performance and scalability (1 TB mem/host, 16 VCPUs/VM, 128 GB/VM)
● Networking (Open vSwitch backend, Active-Backup NIC Bonding)
● More guest OS templates
XCP 1.6 (available Sept/Oct '12)
● Xen 4.1.2, CentOS 5.7 with kernel 2.6.32.43, Open vSwitch 1.4.1
● New-format Windows drivers, installable by Windows Update Service
● Networking: better VLAN scalability, LACP bonding, IPv6
● More guest OS templates: Ubuntu Precise 12.04, RHEL, CentOS, Oracle Enterprise Linux 6.1 & 6.2, Windows 8
● Storage XenMotion:
○ Migrate VMs between hosts/pools without shared storage
○ Move a VM's disks between storage repositories while the VM is running
Xen/Xen Cloud Platform Installation, Configuration
Xen Light, XCP Installer
Installing Xen
Xen installation instructions, including from source: http://wiki.xen.org/wiki/Xen_Overview

1. Install a Linux distro
2. Install the Xen hypervisor package
3. Install a dom0 kernel (packages available for many distros)
4. Modify the GRUB config to boot the Xen hypervisor instead

Result: A working Xen hypervisor and "Xen Light" installation (a Debian-flavoured sketch of these steps follows)
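As a concrete illustration, here is a minimal sketch of steps 1-4 on a Debian/Ubuntu-style dom0. Package names and the GRUB handling vary by release, so treat the exact commands as assumptions to check against your distro's documentation:

# Hypervisor + toolstack packages (names are distro/release dependent)
sudo apt-get install xen-hypervisor-amd64 xen-utils

# Prefer the Xen entry in GRUB; Debian ships a 20_linux_xen snippet, and
# diverting it ahead of 10_linux is one common way to make it the default
sudo dpkg-divert --divert /etc/grub.d/08_linux_xen --rename /etc/grub.d/20_linux_xen
sudo update-grub
sudo reboot

# After rebooting into Xen, dom0 should appear in the domain list
sudo xl list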
Installing XCP
1. Download the latest XCP ISO: http://xen.org/download/xcp/index.html
2. Boot from the ISO and proceed through the XCP installer

Result: A ready-to-go Xen hypervisor, dom0, and XAPI
Xen Cloud Platform Concepts
Pools, hosts, storage, networks, VMs
Xen Cloud Platform (XCP)
● XCP was originally derived from Citrix XenServer (a free enterprise product), is open source, and is free
● XCP promises to contain cutting-edge features that will drive future developments of Citrix XenServer
Xen Cloud Platform (XCP)
● Again, XCP includes:
○ Open-source Xen hypervisor
○ Enterprise-level XenAPI (XAPI) management tool stack
○ Support for Open vSwitch (open-source, standards-compliant virtual switch)
XCP Features
● Fully-signed Windows PV drivers
● Heterogeneous machine resource pool support
● Installation by templates for many different guest OSes
XCP XenAPI Mgmt Tool Stack
● VM lifecycle: live snapshots, checkpoint, migration
● Resource pools: live relocation, auto configuration, disaster recovery
● Flexible storage, networking, and power management
● Event tracking: progress, notification
● Upgrade and patching capabilities
● Real-time performance monitoring and alerting
XCP's xsconsole (SSH or Local)
XCP Command Line Interface
# xe template-list (or # xe vm-import filename=lenny.xva )
# xe vm-install template=<template> new-name-label=<name>
# xe vm-param-set uuid=<uuid of new VM> other-config:install-repository=http://ftp.debian.org/
# xe network-list
# xe vif-create network-uuid=<network uuid from above> vm-uuid=<uuid of new VM> device=0
# xe vm-start vm=<name of VM>
Further Information
● http://pdub.net/2011/12/03/howto-install-xcp-in-kvm/
Unit 2: Nuts and Bolts
Steve Maresca
● Wearer of many hats
○ Security analyst at a top-20 public university in the Northeast
○ Developer for the Zentific virtualization management suite, with a team of developers
○ Involved in the Xen world since 2005
Steve Maresca
● Why do I use Xen?
○ Original impetus: malware/rootkit research
○ Mature research community built around Xen
○ Flexibility of the architecture and codebase permits infinite variation
○ Using it today for infrastructure as well as continuing with security research
■ LibVMI, introspection
Unit 2: Overview
● The structure of this presentation follows the general path we take when mentally approaching virtualization
○ Start simple, increase the level of sophistication
● Overall flow:
○ Why Virtualization?
○ XCP Deployment
○ Management
○ VM Deployment
○ Monitoring
○ Advanced Monitoring and Automation
○ Best Practices
Why virtualization?
● We're all familiar with the benefits
○ When the power bill drops by 25% and the server room is ten degrees cooler, everyone wins
● Bottom line: more efficient resource utilization
○ Requires proper planning and resource allocation
○ Every industry publication, technical and otherwise, has made 'cloud' a household term
○ Expectations set high, then reality arrives with different opinions
Why virtualization?
● Many of us have had, or will have, difficulty making the leap
○ Growing pains: shared resources of virtualization hardware stretched thin
○ Recognition that it requires both capital and staffing investment
● Certainly, you CAN use virtualization with the traditional approaches used with real hardware
○ E.g.: VM creation wizard, upload ISO, attach ISO, boot, install, configure. Repeat.
■ Almost everyone does this
○ Without much effort, you have consolidated 10 boxes into one or two; many organizations find success at this scale
● ...but we have much more flexibility at our disposal; use it!
Why virtualization?
● Virtualization provides the tools to avoid the endless parade of one-off installations and software deployments
● Repeatable and measurable efficiency is attainable
○ Why install Apache 25 times when one well-tuned configuration meets your needs?
Unit 2: Nuts and Bolts
Deployment Methodologies for Infrastructure and Virtual
Machines
Existing deployment methods
● Traditional deployment method: install from CD○ still works for virtualization and new XCP hosts○ If installing for the first time, this is the simplest way to
get your feet wet○ ISOs available at xen.org○ For deploying 5-10 systems, this method is manageable○ Don't fix what isn't broken: if it works for you, go for it○ For deploying 10-50 systems, this hurts
● We've all installed from CD/DVD a thousand times○ That's probably 950 times too many○ But..there are alternatives, and better ones at that
● XCP can be installed on a standard linux system thanks to Project Kronos○ apt-get install xcp-xapi○ Patrick discussed this earlier
● XCP can be installed via more advanced means● Virtual machines can be deployed via templates and clones
○ Golden images○ Snapshots○ Linked clones ○ Templates○ These methods are here to stay
Existing deployment methods
Preboot Execution Environment (PXE)
● Extraordinarily convenient mechanism to leverage network infrastructure to deploy client devices, often lacking any local disk
● Uses DHCP, TFTP; often uses NFS/HTTP after the initial bootstrap
● Intel and partners produced the spec in 1999
Preboot Execution Environment (PXE)
● Most commonly encountered over the years for:
○ Remote firmware update tools
○ Thin-client remote boot
○ LTSP (Linux Terminal Server Project)
○ Windows Deployment Services (Remote Installation Services)
○ Option ROMs on NICs
● Lightly used in many regards, foreign to many
● By no means a dead technology
Preboot Execution Environment (PXE)
● To facilitate PXE:
○ Early in its boot process, a PXE-capable device emits a DHCP request
○ The DHCP request is answered with extra fields indicating a PXE environment is available (typically the 'next-server' option, pointing the DHCP client at an adjacent TFTP server for the next steps)
■ PXE-unaware clients requesting an IP ignore the extra data
○ The DHCP client, having obtained an IP, fetches a small bootloader from the TFTP server
○ Additionally, a configuration file is downloaded with boot information (location of kernel, command line, etc.)
PXE Architecture

[Diagram: new VMs on a deployment VLAN and a production VLAN reach the DHCP, TFTP, and WDS servers through the network switches and routers]
PXE Architecture: Components
● DHCP
○ ISC DHCP, Windows, almost anything works
● TFTPd
○ TFTP is an extraordinarily simple protocol, so...
○ If it is a TFTP server, it's perfect
● Windows Deployment Services
● HTTP or FTP
○ Apache, nginx, lighttpd, IIS, a bash script, ...
○ Optional, but very useful for serving scripts, configuration files, etc.
● Roll your own on one server with very modest resources
PXE Architecture: Components
● Purpose-built solutions
○ Cobbler
■ Fedora project, Red Hat supported
■ Supports KVM, Xen, VMware
○ LTSP (Linux Terminal Server Project)
○ Windows Deployment Services
○ FOG Project
So what does PXE buy us?
● Near zero-footprint deployment model
● Leverages services you almost certainly already have in place
● Guaranteed reproducible deployments
● Agnostic with respect to virtual/physical and OS
● Goes where no USB key or optical drive even exists
Requirements for deployment via PXE
● Server requires a NIC with a PXE ROM available
● NIC enabled for booting
● Very nice if you're using a blade chassis or iLO; easy to reconfigure on the fly
● Requires an answer file prepped for the host
● Configured DHCP server
● Configured TFTP server
Mechanisms for automated install
● The general concept is often called an "answer file"
○ A file with a list of instructions is delivered to the OS installer with device configuration info, a list of packages to install, possibly including custom scripts, etc.
● Linux
○ CentOS/RHEL: kickstart
○ Debian/Ubuntu: preseed (though kickstart files are gaining popularity in the Debian world; see the short sketch after this list)
● Windows
○ WAIK, the Windows Automated Installation Kit
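To make the idea concrete, a few illustrative lines from a Debian preseed file are shown below. The values are placeholders, and a real preseed would also cover partitioning, users, and so on:

# Minimal Debian preseed excerpt (illustrative values only)
d-i debian-installer/locale string en_US
d-i netcfg/choose_interface select eth0
d-i mirror/http/hostname string ftp.debian.org
d-i mirror/http/directory string /debian
d-i passwd/root-password password changeme
d-i passwd/root-password-again password changeme
d-i pkgsel/include string openssh-server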
Example infrastructure setup
● Debian as the base OS
● ISC-DHCP as a means of advertising next-server DHCP option
● tftpd-hpa for a tftp daemon
● Also running Apache for serving scripts and a variety of other files as installation helpers
Our configuration: ISC-DHCP

shared-network INSTALL {
  subnet 192.168.2.0 netmask 255.255.255.0 {
    option routers 192.168.2.1;
    range 192.168.2.2 192.168.2.254;
    allow booting;
    allow bootp;
    option domain-name "zentific";
    option subnet-mask 255.255.255.0;
    option broadcast-address 192.168.2.255;
    option domain-name-servers 4.2.2.1;
    option routers 192.168.2.1;
    next-server 192.168.2.1;
    filename "pxelinux.0";
  }
}
Deploying XCP via PXE
● Requires an "answer file" to configure the XCP system in an unattended fashion
● Also leverages HTTP to host the answer file and some installation media
● TFTP serves a pxeconfig referencing the answer file and providing basic configuration for the installer (console string, minimum RAM, etc)
Deploying XCP via PXE: pxeconfig
DEFAULT xcp
LABEL xcp
  kernel mboot.c32
  append /xcp/xen.gz dom0_max_vcpus=2 dom0_mem=2048M com1=115200,8n1 console=com1 --- /xcp/vmlinuz xencons=hvc console=hvc0 console=tty0 answerfile=http://192.168.2.1/xcp_install/xcp_install_answerfile install --- /xcp/install.img
Deploying XCP via PXE: answerfile

<?xml version="1.0"?>
<installation>
  <primary-disk>sda</primary-disk>
  <keymap>us</keymap>
  <root-password>pandas</root-password>
  <source type="url">http://192.168.2.1/xcp_install</source>
  <post-install-script type="url" stage="filesystem-populated">
    http://192.168.2.1/xcp_install/post.sh
  </post-install-script>
  <admin-interface name="eth0" proto="static">
    <ip>192.168.2.172</ip>
    <subnet-mask>255.255.255.0</subnet-mask>
    <gateway>192.168.2.1</gateway>
  </admin-interface>
  <nameserver>4.2.2.1</nameserver>
  <timezone>America/New_York</timezone>
</installation>
Deploying XCP via PXE

[Screenshots: the unattended XCP installation proceeding over PXE]
Deploying XCP via PXE, complete
Unit 2: Nuts and Bolts
Deployment Methodologies for Virtual Machines
● Again, traditional methods
○ VM creation wizard, upload ISO, attach ISO, boot, install, configure. Repeat.
○ Almost everyone does this
● Virtual machines can be deployed via templates and clones
○ Golden images
○ Snapshots
○ Linked clones
○ Templates
○ These methods are here to stay
Existing deployment methods
● XCP makes deployment of VMs simple
○ Templates:
    # xe template-list | grep name-label | wc -l
    84
○ Clones: xe vm-clone
● Virtual machines can be deployed via templates and clones
○ Golden images
○ Snapshots
○ Linked clones
○ Templates
○ These methods are here to stay
Existing deployment methods
Deploying Centos via PXE
● Customization via Kickstart
● The Anaconda installer uses "one binary to rule them all," so customization at installation time is more restrictive than in other distributions
● Standard pxeconfig
Deploying Centos : PXE config
SERIAL 0 115200
CONSOLE 0
DEFAULT centos_5.6_x86_64_install
LABEL centos_5.6_x86_64_install
  kernel centos/5.6/x86_64/vmlinuz
  append vga=normal console=tty initrd=centos/5.6/x86_64/initrd.img syslog=192.168.1.2 loglevel=debug ksdevice=eth0 ks=http://192.168.2.1/centos-minimal.ks --
PROMPT 0
TIMEOUT 0
Deploying Centos : Kickstart

install
text
lang en_US.UTF-8
key --skip
skipx
logging --host=192.168.1.125
network --device eth0 --bootproto dhcp
url --url http://mirrors.greenmountainaccess.net/centos/5/os/x86_64
rootpw --iscrypted $1$j/VY6xJ6$xxxxxxxxx
firewall --enabled --port=22:tcp
authconfig --enableshadow --enablemd5
selinux --enforcing
timezone --utc America/New_York
zerombr
bootloader --location=mbr --driveorder=hda
clearpart --initlabel --all
autopart
reboot
Deploying Centos : Kickstart
● Make a new VM using the "Other install media" template
○ # SRDISKUUID refers to the identifier (UUID) of the storage repository
○ xe vm-install new-name-label=$VMNAME sr-uuid=$SRDISKUUID template="Other install media"
● Set the boot order (DVD, Network, Hard Drive):
○ xe vm-param-set uuid=$VMUUID HVM-boot-params:order="ndc"
Deploying Centos via PXE
Unit 2: Nuts and Bolts
XCP: Modifying the OS (just a quick comment)
Installing Software, or: Reminding XCP of its Linux Heritage

● XCP is by no means a black box, forever sealed away
● It's only lightly locked down and easy to modify
○ Take care; it's not designed for significant upheaval
○ Very convenient to install utilities, SNMP, etc.
● Just: yum --disablerepo=citrix --enablerepo=base install screen
● Helps a lot with additional monitoring utilities
Unit 2: Nuts and Bolts
Monitoring and Automation
XCP Event Publisher (XAPI)

[Diagram: XAPI publishes VM events over AMQP, IF-MAP, or 0MQ to IDS, firewall, and middleware components; automation and response feed back to the VMs in an adaptive feedback loop]
Exploring the XCP API
What it is
● The XCP API is the backbone of the platform
○ Provides the glue between components
○ Is the backend for all management applications
● Call it XAPI or XenAPI
○ Occasionally when searching, "XAPI" can be a bit better to differentiate from earlier work in traditional open-source Xen deployment
● It's an XML-RPC style API, served via HTTPS
○ Provided by a service on every XCP dom0 host
What it is
● API bindings are available for many languages
○ .NET
○ Java
○ C
○ PowerShell
○ Python
● Documentation available via the Citrix Developer Network (in this regard, XCP == XenServer)
○ http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/api/
○ http://community.citrix.com/display/xs/Introduction+to+XenServer+XAPI
What it is
● Official API bindings not available for your language of choice? No problem
● Protocol choice of XML-RPC means that most languages can support the API natively
● Ease of integration is superb. Here's an example using python (but ignoring the official bindings)
What it is
import xmlrpclib

x = xmlrpclib.Server("https://localhost")
sessid = x.session.login_with_password("root", "pass")['Value']
# go forth, that's all you needed to begin
allvms = x.VM.get_all_records(sessid)['Value']
What it is
● XAPI is available for use on any XenServer or XCP system
● In addition, as mentioned in our opening segment, XAPI is accessible on Ubuntu/Debian systems via the Kronos project
What XAPI isn't
● Not exactly 1:1 with the xe commands from the XCP command line
○ Significant overlap, but not exact
● NOT an inflexible beast like some APIs
○ Can be extended via plugins
○ And (of course) it is open source if you want to get your hands dirty
■ LGPL 2.1
Comparisons to other APIs in the virtualization space
● Generally speaking
○ XAPI is well designed and well executed
○ XAPI makes it pleasantly easy to achieve quick productivity
○ Some SOAPy lovers of big XML envelopes and WSDLs scoff at XML-RPC, but it certainly gets the job done with few complaints
Comparisons to other APIs in the virtualization space
● Amazon EC2
○ XAPI has greater "surface area" than Amazon EC2, which is a classic example of doing a lot with rather a little API
○ In particular, XAPI brings you closer to the virtual machine and underlying infrastructure than EC2
○ XAPI provides considerable introspection into the virtual machine itself
■ Data reported by Xen-aware tools within the guest is reported as part of VM metrics
■ Data can be injected into the VM using the xenstore
Comparisons to other APIs in the virtualization space
● Oracle VM (also Xen based)
○ Similar heritage; derives partly from the traditional XenAPI, of which XAPI is a distant relative
○ Generally speaking, the Oracle VM API is on par for typically needed features, but XAPI is more powerful (e.g., networking capabilities)
Comparisons to other APIs in the virtualization space
● VMware
○ XAPI is far more tightly constructed than VMware's huge (very capable, impressive) API
○ By nature of protocol construction, XAPI is XML-RPC vs. the heavier VMware SOAP API: measurably lower bandwidth requirements and parsing overhead
○ VMware's API has a distinct feel of organic growth ("one of these things is not like the other" is a common tune whistled while working with it)
○ Speaking from a personal developer standpoint, sanity with XAPI in comparison is much higher (we, Zentific, have worked very closely with both APIs)
API Architecture
API Architecture: General shape and form
● All elements on the diagram just shown are called classes
● Note: the diagram omits another twenty or more minor classes
○ Visit the SDK documentation for documentation of all classes
● Classes are the objects XCP knows about and exposes through API bindings
● Each class has attributes called fields and functions called messages. We'll stick with 'attributes' and 'functions.'
API Architecture: General shape and form
● Class attributes can be read-only or read-write
● All class attributes are exposed via setter and accessor functions
○ E.g. for a class named C with attribute X: C.get_X
○ There's a corresponding C.set_X too if the attribute is read-write; absent if read-only
○ For map-type attributes, there are C.add_to_X and C.remove_from_X for each key/value pair
API Architecture: General shape and form
● Class functions are of two forms: implicit and explicit (a short sketch follows this list)
○ Implicit class functions include:
■ a constructor (typically named "create")
■ a destructor (typically named "destroy")
■ Class.get_by_name_label
■ Class.get_by_uuid
■ Class.get_record
■ Class.get_all_records
○ Explicit class functions include every other documented function for the given class; these are generally quite specific to the intent of that class
■ e.g. VM.start
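A small sketch of implicit vs. explicit calls using the Python bindings; the host address, credentials, and the 'web01' name-label are placeholders:

import XenAPI

session = XenAPI.Session('https://xcp-host.example.com')   # placeholder host
session.login_with_password('root', 'secret')

# Implicit functions: lookups and record dumps exist for every class
refs = session.xenapi.VM.get_by_name_label('web01')        # returns a list of OpaqueRefs
record = session.xenapi.VM.get_record(refs[0])

# Explicit function, specific to the VM class: VM.start(vm, start_paused, force)
session.xenapi.VM.start(refs[0], False, False)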
API Architecture: General shape and form

A note on UUIDs and OpaqueRefs
● Multiple forms of unique identifier are used in XCP
○ Universally Unique Identifiers (UUIDs)
○ OpaqueRefs
○ Class-specific identifiers
○ name-labels
● Both UUIDs and OpaqueRefs can be encountered in API calls and xe commands
○ Conversion between UUIDs and OpaqueRefs is commonly required (a small sketch follows)
○ The parallel naming convention is an acknowledged odd consequence of development aiming at unique identifiers
○ General rule (feel free to break): if using the API, use refs; if using xe, use UUIDs (per David Scott, Citrix)
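A short sketch of moving between the two identifier forms with the Python bindings (the UUID value is a placeholder, e.g. copied from xe vm-list output):

# Assumes an authenticated `session`, as in the earlier examples
uuid = '12345678-abcd-ef01-2345-6789abcdef01'      # placeholder UUID
ref = session.xenapi.VM.get_by_uuid(uuid)          # UUID -> OpaqueRef
uuid_again = session.xenapi.VM.get_uuid(ref)       # OpaqueRef -> UUID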
API Architecture: Major Classes
● All major classes are shown in the inner circle of the API diagram
○ VM: a virtual machine
○ Host: a physical XCP host system
○ SR: storage repository
○ VDI: virtual disk image
○ PBD: physical block device through which an SR is accessed
○ VBD: virtual block device
○ Network: a virtual network
○ VIF: a virtual network interface
○ PIF: a physical network interface
API Architecture: Minor Classes
● Minor classes are documented in the official XenServer SDK documentation
○ pool: XCP host pool information and actions
○ event: asynchronous event registrations
○ task: used to track asynchronous operations with a long runtime
○ session: API session management, login, password changes, etc.
API Architecture: Linking Classes
● Linking classes are those that create a conceptual bridge between a virtual object and the underlying physical entity
○ VDI <> VBD <> VM
■ VBD: bridges the representation of a virtual machine's internal disk with the actual disk image used to provide it
○ Network <> VIF <> VM
■ VIF: bridges the internal VM network interface with the physical network to which it is ultimately plumbed
● When building complex objects, it's often necessary to build the linkages too, or failure will occur
API Architecture: Other Classes
● SM: storage manager plugin - for third-party storage integration (e.g. Openstack Glance)
● Tunnel: represents a tunnel interface between networks/hosts in a pool
● VLAN: assists in mapping a VLAN to a PIF, designating tagged/untagged interfaces. Each VLAN utilizes one PIF
API Architecture: Order of Operations
● Using a correct order of operations for API calls is important, though not particularly well documented
● Example: deleting a disk (see the sketch after this list)
○ Resources must not be in use
○ If deleting a VDI, make certain that no VBDs currently reference it
● Generally, common sense dictates here in terms of the operations required
● When something is executed out of order, an exception is thrown
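A hedged sketch of the deleting-a-disk example using the Python bindings; it assumes an authenticated `session` and a `vdi_ref` obtained elsewhere, and the exact preconditions still depend on the SR and VM state:

# Detach and remove anything still referencing the VDI before destroying it
for vbd_ref in session.xenapi.VDI.get_VBDs(vdi_ref):
    if session.xenapi.VBD.get_currently_attached(vbd_ref):
        # May fail if the VM is running and the VBD is not hot-unpluggable
        session.xenapi.VBD.unplug(vbd_ref)
    session.xenapi.VBD.destroy(vbd_ref)

# Only now is it safe to delete the disk image itself
session.xenapi.VDI.destroy(vdi_ref)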
API Architecture: Target the right destination
● When running calls against a standalone XCP system, no extra consideration is needed
● When running operations against a pool, it's necessary to target the pool master
○ Otherwise an API exception is thrown if you attempt to initiate an action against a slave (type XenAPI.Failure if using the provided Python bindings)
● It's reasonably easy to code around this problem (the pool master may rotate, after all): http://community.citrix.com/display/xs/A+pool+checking+plugin+for+nagios
API Architecture: Target the right destination
import XenAPI

host = "x"
user = "y"
password = "p"

try:
    session = XenAPI.Session('https://' + host)
    session.login_with_password(user, password)
except XenAPI.Failure, e:
    if e.details[0] == 'HOST_IS_SLAVE':
        # The exception names the pool master; retry the login against it
        session = XenAPI.Session('https://' + e.details[1])
        session.login_with_password(user, password)
    else:
        raise

s = session.xenapi
XAPI is Extensible: Plugins
● Extensible API via plugins (a minimal plugin sketch follows this list)
○ These are scripts that you place on the XCP host
■ Check out /etc/xapi.d/plugins/
○ Can be invoked via the API
■ See host.call_plugin(...)
● Affords huge flexibility for customization
● Used today by projects like OpenStack to provide greater integration with XCP
● Example code
○ http://bazaar.launchpad.net/~nova-core/nova/github/files/head:/plugins/xenserver/xenapi/etc/xapi.d/plugins/
○ https://github.com/xen-org/xen-api/blob/master/scripts/examples/python/XenAPIPlugin.py
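A minimal sketch of what such a plugin can look like, modeled on the XenAPIPlugin.py example linked above; the file name 'hello' and the argument names are hypothetical:

#!/usr/bin/env python
# /etc/xapi.d/plugins/hello  (must be executable on the XCP host)
import XenAPIPlugin

def main(session, args):
    # args is a dict of string key/value pairs supplied by the caller
    return "hello, " + args.get("name", "world")

XenAPIPlugin.dispatch({"main": main})

It would then be invoked from a client with something along the lines of session.xenapi.host.call_plugin(host_ref, 'hello', 'main', {'name': 'OSCON'}).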
Things to know
● To access a VM console, a valid session ID must be appended to the request
○ See http://foss-boss.blogspot.com/2010/01/taming-xen-cloud-platform-consoles.html
● Metrics
○ ${class}_metrics are instantaneous values; this is an older XCP/XenServer style of providing such data
○ The same metrics provided via the RRD backend are historical and can show trending (rather than needing to aggressively poll for instantaneous values)
● It's possible to add xenstore values for a VM, which enables an agent in the VM to act upon that data (a small sketch follows)
○ Consider: root password reset via xenstore; directed actions
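For instance, arbitrary key/value data can be pushed toward the guest through the VM's xenstore-data map; the key and value below are hypothetical, and an in-guest agent has to watch for and act on them:

xe vm-param-set uuid=<vm uuid> xenstore-data:vm-data/admin-action=reset-root-password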
Unit 2: Nuts and Bolts
Best Practices
Best Practices
These are primarily 'general' best practices
Common-sense best practices are especially critical for virtualization given:
● the sharing of scarce resources (and the complex interplay thereof when it comes to performance)
● Many eggs are in one basket: failures are felt very strongly
Best Practices: Less is more
● Often, fewer VCPUs per VM are better
○ Allocate only what's needed for the workload
○ If unknown, begin with 1 VCPU and work up as needed
● Always account for the CPU needs of the hypervisor
● Never allocate more VCPUs for a VM than the number of available PCPUs (even if you "can")
● Great video by George Dunlap for more guidance: http://www.citrix.com/tv/#videos/2930
Best Practices: Workload grouping
● Group VMs logically based upon expected (or observed) workload and behavior
○ Workloads which are randomly 'bursty' from an I/O or CPU standpoint
○ Regularly scheduled workloads demanding high CPU when running: interleave the schedule if possible so each VM has the maximal share of resources
Best Practices: Workload separation
● Separate VMs logically based upon expected (or observed) workload and behavior
○ Workloads which always require the majority of what the hardware can provide for performance (like an I/O bottleneck on the network when the pipe is only so wide)
○ Workloads like databases that can be heavy on memory utilization and bandwidth
Best Practices: Resource allocation
● If needed, guarantee resources for a workload
○ Grant higher scheduling priority
○ Pin VCPUs to physical cores
○ Balloon the VM in anticipation of memory usage, then return memory to the pool
● WARNING: use with caution
○ Possible to reduce performance for adjacent workloads on the same host
○ Possible to lock a VM to a host (migration becomes problematic)
Best Practices: Compartmentalize Risk
● Segregate VMs operating in distinct security domains
○ A good practice no matter what the context
○ Certainly your user-facing services don't need access to the same network that allows switch/router management; the same applies to VMs
● Especially important if required by compliance/regulations
○ Example: PCI-DSS (Payment Card Industry Data Security Standard)
■ https://www.pcisecuritystandards.org/documents/Virtualization_InfoSupp_v2.pdf
○ Example: DoD regulations regarding data classification and separation of networks
■ Crossing the streams causes total protonic reversal
Best Practices: Monitor your environment!
● Log aggregation AND analysis:
○ If you don't know how to identify when a problem is occurring, how can you circumvent/fix/prevent it?
● Forecasting for the future
● Virtual environments are dynamic enough that problems can sneak up on you
● If you have a head start on hardware failure, you can migrate VMs from a failing host to a hot spare to enable repair/replacement (without downtime)
● Don't forget to monitor hardware temperature; HVAC failures are not much fun
○ The virtual fallout can be enormous: high power density -> high heat takes out high-visibility, high-value resources by the dozen
● Knowing when to prefer real hardware over virtualization is as important as being able to recognize when virtualization will be of benefit
○ Virtualization is not a panacea
● Problematic workloads
○ Highly parallel computations requiring many CPUs acting in concert
○ Heavy I/O demands on network or storage
○ Tasks which require exceptionally stable clocks (nanosecond granularity)
● But: technology is improving at breakneck speed
○ 10 Gb Ethernet at line rate is possible for a virtual machine
○ CPU improvements have reduced or eliminated many bottlenecks (clock stability is much better, for example)
Best Practices: When not to virtualize
Best Practices: Resource Modeling
● Build a simple model for your environment
○ Try to do so before virtualizing a service and afterward, then compare
○ Helps with cost management and expenditure justification
○ Measures success or failure of virtualization to solve a problem
● E.g. $x/GB of RAM + $x/VCPU + $x/hr labor + $licensing/VM + a VM importance factor
● Calculate the worst case for the model and then graph the current state relative to that
OSCON: From the Datacenter to the Cloud - Featuring Xen and XCP
XCP in the Enterprise
Josh West
Table of Contents
● Introduction: XCP in the Enterprise
● Storage in Xen Cloud Platform
● Advanced Networking in Xen Cloud Platform
● Statistics & Monitoring in XCP
● Enterprise Cloud Orchestration
Introduction: XCP in the Enterprise
● The Xen hypervisor has already proven itself a solid platform choice for IT systems:
● Amazon
● Rackspace
● Oracle VM
● dom0 Mainline
● No need to run Xen on your distribution flavor of choice and build it from the ground up just to host IT business systems.
● Many choices (Vmware, RHEV, Oracle VM, Citrix XenServer).
So... Why use XCP?
● Excellent blend of enterprise quality code and next generation technologies.
● Developed by Citrix/XenSource.
● Enhanced by the open source community.
● Compatible with Citrix XenCenter for management.
● Rapid deployment:○ PXEBOOT○ Boot from SAN
XCP and Pools
● Pools allow you to combine multiple XCP hosts into one managed cluster.
○ Live migration.
○ Single API connection & management connection.
○ Single configuration.
○ Shared storage.
● Single master, multiple slaves.
XCP or Citrix XenServer?

Citrix XenServer:
● Professional Support
● High Availability
● Advanced Storage
● Cloudstack & Openstack
● Benefits from XCP Community contributions
Xen Cloud Platform:
● Community Support
● DIY High Availability
● Standard Storage
● Cloudstack & Openstack
● Benefits from Citrix developers & codebase
DIY? Roll Your Own
● Still not convinced? See Project Kronos.
● Benefits of XAPI in a *.deb Package.
● Run on Debian or Ubuntu dom0 with Xen Hypervisor.
● http://wiki.xen.org/wiki/Project_Kronos
Enough Promo!
Let's see the cool stuff!
Storage in XCP
Storage in XCP
● Supports major storage technologies & protocols
● Local storage, for standalone & scratch VM's.
● Centralized storage, for live migration & scaling:
○ LVMoISCSI and LVMoFC and LVMoAOE
■ Software iSCSI Initiator
■ HBA (Qlogic & Emulex)
■ Coraid has drivers for AoE
○ VHD on NFS
Under the Hood: VHD
● VDI's are stored in Virtual Hard Disk (VHD) format.*
● From Microsoft! (Connectix), under Microsoft Open Specification Promise.
● Types of VHDs:
○ Fixed hard disk image (appliances).
○ Dynamic hard disk image (XCP).
○ Differencing hard disk image (snapshots, cloning).
● Tools from Microsoft & Virtualbox for working/converting.
Under the Hood: LVM on XCP
● LVM is used on all block storage in XCP.
● XCP organizes with a simple mapping:
○ Storage Repository (SR) = LVM Volume Group
○ Virtual Disk Image (VDI) = LVM Logical Volume
● Locking is not handled like cLVM.
● XCP Pool Master toggles access w/ lvchange -ay/an.
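You can see that mapping directly on an XCP host with the standard LVM tools. The VG/LV naming below follows the convention the SR backend uses (volume groups named after the SR UUID, logical volumes after the VDI UUID); treat the exact prefixes as an assumption to verify on your version:

# List the SR-backed volume group and its per-VDI logical volumes
vgs                                   # e.g. VG_XenStorage-<SR uuid>
lvs VG_XenStorage-<SR uuid>           # e.g. VHD-<VDI uuid> (LV-<VDI uuid> on older SR types)

# The pool master activates/deactivates a volume around VBD attach/detach
lvchange -ay /dev/VG_XenStorage-<SR uuid>/VHD-<VDI uuid>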
Under the Hood: LVM on XCP

[Diagram slides: the Storage Repository / Virtual Disk Image layout mapped onto an LVM volume group and its logical volumes]
● XCP uses VHD dynamic disk images on top of LVM.
● So we have VHDoLVMo(ISCSI|FC|AOE).
● And then all our VM's will probably use LVM:
● LVMoVHDoLVMo(ISCSI|FC|AOE). :-)
● VHD differencing disk images for VM/VDI snapshots, not LVM snapshots.
○ Portable between Storage Repository types.
○ No LVM snapshot performance issues.
Under the Hood: NFS on XCP
● NFSv3 w/ TCP is used for NFS based SR's.
● Mounted at /var/run/sr-mount/<SR UUID>/
● Mounted with 'sync' flag; no 'async' delayed operation as this would be unwise and unsafe for VM's.
● NFS lets you get closer to VHD's - they're stored as files.
● Perhaps could integrate better with your backup solution.
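Creating an NFS SR from the CLI looks roughly like this; the server name and export path are placeholders, and it's worth checking xe help sr-create for the exact device-config keys on your version:

xe sr-create type=nfs name-label="NFS VHD storage" content-type=user shared=true \
   device-config:server=nas.example.com device-config:serverpath=/export/xcp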
Under the Hood: NFS on XCP
● Choose NFS platform wisely for proper performance.
● Just a Linux box w/ NFS export not enough: ~32 MB/s.
● Need cache system on your NAS (e.g. NetApp PAM).
● DIY? Look into using SSD's or BBU NVRAM w/ Facebook's Flashcache or upcoming Bcache.
● Gluster has NFS server and Gluster is tunable.
XCP Storage: Which to Choose?
● All good choices. Depends on your shop & experience.
● If you have an enterprise NAS/SAN, use it!
○ Caching for performance.○ Enterprise support contracts.○ Alerting and monitoring.
● No budget? No space left? No problem. You can build your own SAN for use with XCP.
● Test labs, recycling equipment, PoC, and small production deployments.
DIY H.A./F.T. SAN for XCP
● Easy to build a storage system (that actually performs well) for use with XCP:
○ Highly Available / Fault Tolerant.
○ Manageable / Not Too Complicated.
● XCP lets you connect to multiple SR's.
● If you outgrow your DIY SAN, or find it going from a test lab purpose to hosting production critical VM's, XCP will let you move VM's between SR's with ease.
● Just attach your expensive shiny SAN/NAS and move.
DIY H.A./F.T. SAN: What We'll Build
● Lightweight Linux-based, clustered SAN for XCP SR.
● Active/Standby with automatic failover & takeover.
● Synchronous storage replication between storage nodes.
● iSCSI presentation to XCP hosts.
● Built with two open source software projects:
○ DRBD
○ Pacemaker
TripAdvisor XCP + XSG Lab
● Built at TripAdvisor, with 19.33TB storage.
● Two Dell PowerEdge 1950's + Cisco 6513 Catalyst.
DIY H.A./F.T. SAN: Overview
[Diagram: two XCP storage nodes, each with eth0/eth1 facing stacked switches for iSCSI, and eth2/eth3 crossover links carrying Corosync/Pacemaker and DRBD traffic]
Step 1: Hardware RAID
● Configure your hardware RAID controller.
● Use features such as Adaptive Read-Ahead and Write-Back, to enable caching.
● Battery backed up cache is important.
● Recommended: RAID 1, 5, or 6 for internal disks.
● Recommended: RAID 10, 50, or 60 for DAS shelves.
Step 2: ILO / DRAC / LOM
● Configure your dedicated ILO card.
● Using Dell Remote Access Controller (DRAC) in our example lab.
● Enable IPMI support. Needed for STONITH.
● Set & remember the credentials. Can test with ipmitool from external host.
● Dedicated NIC recommended!
Step 3: Install OS
● Install CentOS x86_64. Tested this with 5.8 & 6.0.
● Partition and configure accordingly.
● Leave space for attached storage.
● Partition the dedicated storage as LVM Physical Volume.
● Use gparted if >2TB.
Step 4: Configure Networking
● Bond eth0 + eth1 front end interfaces w/ LACP (bond0).
● Crossover eth2 to eth2, eth3 to eth3 backend interfaces.
○ eth2: Dedicated for corosync + pacemaker.○ eth3: Dedicated for DRBD replication.
Network               Interface   Storage Node 1   Storage Node 2
Management            bond0       192.168.0.10     192.168.0.11
Corosync + Pacemaker  eth2        10.168.0.10      10.168.0.11
DRBD                  eth3        10.168.1.10      10.168.1.11
*Floating iSCSI IP                192.168.0.20
Step 4: Configure Networking
[Diagram: bond0 (eth0+eth1) carries 192.168.0.10 / 192.168.0.11 plus the floating 192.168.0.20 toward the stacked switches; the eth2 crossover (10.168.0.10 / 10.168.0.11) carries Corosync + Pacemaker; the eth3 crossover (10.168.1.10 / 10.168.1.11) carries DRBD]
Step 5: Configure LVM
● Setup dedicated storage partition:
$ pvcreate /dev/sdb1
$ vgcreate vg-xcp /dev/sdb1
$ lvcreate -l 100%FREE -n lv-xcp vg-xcp
● Adjust /etc/lvm/lvm.conf filters and run vgscan:
filter = [ "a|sd.*|", "r|.*|" ]
● XCP will put LVM on top of iSCSI LUN's (LVMoISCSI).
● SAN should not scan local DRBD resource content.
Step 6: Install DRBD
● Latest stable... Constantly in motion.
$ yum install gcc kernel-devel rpm-build flex
● Fetch from http://oss.linbit.com/drbd/ (8.4.1)
$ mkdir -p ~/redhat/{RPMS,SRPMS,SPECS,SOURCE,BUILD}
$ tar -xvzf drbd-8.4.1.tar.gz
$ cd drbd-8.4.1
$ make rpm km-rpm
$ yum install /usr/src/redhat/RPMS/x86_64/drbd*.rpm
or
$ yum install ~/redhat/RPMS/x86_64/drbd*.rpm
Step 7: Configure DRBD
● Four major sections to adjust:
○ syncer { ... }
○ net { ... }
○ disk { ... }
○ handlers { ... }
● See DRBD documentation for full details.
● http://www.drbd.org/docs/about
Step 7: global_common.conf

syncer {
rate 1G;
verify-alg "crc32c";
al-extents 1087;
}
disk {
on-io-error detach;
fencing resource-only;
}
handlers {
... [ snip ] ...
fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
... [ snip ] ...
}
net {
sndbuf-size 0;
max-buffers 8000;
max-epoch-size 8000;
unplug-watermark 8000;
}
Step 7: d_xcp.res

resource d_xcp {
net {
allow-two-primaries;
}
on xsgnode1 {
device /dev/drbd0;
disk /dev/vg-xcp/lv-xcp;
address 10.168.1.10:7000;
meta-disk internal;
}
on xsgnode2 {
device /dev/drbd0;
disk /dev/vg-xcp/lv-xcp;
address 10.168.1.11:7000;
meta-disk internal;
}
}
Review
● Two servers with equal storage space.
● First two NIC's bonded to network.
● Third NIC crossover, dedicated for corosync/pacemaker.
● Fourth NIC crossover, dedicated for DRBD.
● We've setup LVM and then DRBD on top.
● Now time to cluster and present to XCP.
Step 8: Corosync + Pacemaker
● Install Yum repo's from EPEL + Clusterlabs
○ EPEL is needed on CentOS/RHEL 5 and 6
○ Clusterlabs repo only needed on CentOS/RHEL 5
○ Red Hat now includes pacemaker :-)
○ http://fedoraproject.org/wiki/EPEL
● Installation & Configuration:
○ http://clusterlabs.org/wiki/Install
○ $ yum install pacemaker.x86_64 heartbeat.x86_64 corosync.x86_64 iscsi-initiator-utils
○ http://clusterlabs.org/wiki/Initial_Configuration
Pacemaker Review
● Nodes
● Resource Agents
● Resources/Primitives
● Resource Groups
● CRM Shell
● Cluster Information Base
● Master/Slave Sets (MS)
● Constraints: Location
● Constraints: Colocation
● STONITH
Pacemaker CRM Shell
What Should Pacemaker Do?
● Manage floating IP address 192.168.0.20 - iSCSI target.
● Configure an iSCSI Target Daemon.
● Present an iSCSI LUN from iSCSI Target Daemon.
● Ensure DRBD is running, with Primary/Secondary.
● Ensure DRBD Primary is colocated with floating IP, iSCSI Target Daemon, and iSCSI LUN.
● Ordering: DRBD, iSCSI Target, iSCSI LUN, floating IP.
Step 9: Pacemaker Configuration
[Diagram: the start order runs DRBD Primary/Secondary → block iSCSI port → iSCSI target → iSCSI LUN → floating IP → unblock iSCSI port; stop proceeds in the reverse order]
Step 9: Pacemaker Configuration

property $id="cib-bootstrap-options" \
    dc-version="1.0.11-..." \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    no-quorum-policy="ignore" \
    default-resource-stickiness="100" \
    stonith-enabled="false" \
    maintenance-mode="false" \
    last-lrm-refresh="1311719446"

rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
Step 9: Pacemaker Configuration

primitive res_ip_float ocf:heartbeat:IPaddr2 \
params ip="192.168.0.20" cidr_netmask="20" \
op monitor interval="10s"
primitive res_portblock_xcp_block ocf:heartbeat:portblock \
params action="block" portno="3260" ip="192.168.0.20" protocol="tcp"
primitive res_portblock_xcp_unblock ocf:heartbeat:portblock \
params action="unblock" portno="3260" ip="192.168.0.20" protocol="tcp"
primitive res_drbd_xcp ocf:linbit:drbd \
params drbd_resource="d_xcp"
ms ms_drbd_xcp res_drbd_xcp \
meta master-max="1" master-node-max="1" \
clone-max="2" clone-node-max="1" notify="true"
Step 9: Pacemaker Configuration

primitive res_target_xcp ocf:tripadvisor:iSCSITarget \
params implementation="tgt" tid="1" \
iqn="iqn.2011-12.com.example:storage.example.xsg" \
incoming_username="target_xcp" incoming_password="target_xcp" \
additional_parameters="MaxRecvDataSegmentLength=131072
MaxXmitDataSegmentLength=131072" \
op monitor interval="10s"
primitive res_lun_xcp_lun1 ocf:heartbeat:iSCSILogicalUnit \
params target_iqn="iqn.2011-12.com.example:storage.example.xsg" \
lun="1" \
path="/dev/drbd/by-res/d_xcp" scsi_id="xcp_1" \
op monitor interval="10s"
Step 9: Pacemaker Configuration

group rg_xcp \
res_portblock_xcp_block \
res_target_xcp \
res_lun_xcp_lun1 \
res_ip_float \
res_portblock_xcp_unblock
colocation c_xcp_on_drbd inf: rg_xcp ms_drbd_xcp:Master
order o_drbd_before_xcp inf: ms_drbd_xcp:promote rg_xcp:start
Step 9: Pacemaker Configuration

[Diagram repeated: the resource start/stop ordering shown above]
Step 10: STONITH Configuration

primitive stonith-xsgnode1 stonith:external/ipmi \
params hostname="xsgnode1.example.com" ipaddr="192.168.0.30" \
userid="root" passwd="shootme"
primitive stonith-xsgnode2 stonith:external/ipmi \
params hostname="xsgnode2.example.com" ipaddr="192.168.0.31" \
userid="root" passwd="shootme"
location loc_stonith_xsgnode1 stonith-xsgnode1 -inf: xsgnode1.example.com
location loc_stonith_xsgnode2 stonith-xsgnode2 -inf: xsgnode2.example.com
property stonith-enabled="true"
Step 11: Review Pacemaker
● Make sure resources are OK: crm status
● Make sure floating IP configured: ip addr
● Make sure DRBD primary/secondary: drbd-overview
● Make sure iSCSI LUN's presented: tgt-admin -s
Step 12: Connect SR in XCP!
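A sketch of what "connect the SR" looks like from the XCP CLI for the iSCSI target built above. The IQN and CHAP credentials mirror the Pacemaker configuration; the SCSIid is a placeholder, typically discovered by probing the target (for example by running sr-create without it and reading the probe output):

xe sr-create type=lvmoiscsi name-label="DIY HA SAN" shared=true \
   device-config:target=192.168.0.20 \
   device-config:targetIQN=iqn.2011-12.com.example:storage.example.xsg \
   device-config:chapuser=target_xcp device-config:chappassword=target_xcp \
   device-config:SCSIid=<SCSI id reported by the probe>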
XCP and High Availability
● We've just shown how to build a highly-available / fault-tolerant SAN, using DRBD and Pacemaker.
● EXT4oVHDoLVMoISCSIoDRBDoLVM :-)
● We did this on CentOS 5.x (and 6.x).
● XCP is based on CentOS 5.x.
● XCP can use Pacemaker for H.A.!
XCP Storage Future
● XCP 1.6 will support Storage XenMotion
○ Migration of VM's and their storage, live!
○ Can evacuate a host with local SR attached VM's.
● Cluster Filesystems:
○ Citrix is looking into Gluster and Ceph.
○ Gluster client builds and works on XCP 1.5b.
○ Relatively easy for us to write a Gluster SR driver.
○ Ceph integration is a bit trickier.
Advanced Networking with Xen Cloud Platform
Advanced Networking with XCP
● Bonding and VLAN's
● OpenvSwitch and OpenFlow
● Distributed Virtual Switch Controller
● GRE Tunnels & Private VM Networks
Advanced Networking with XCP
NIC Bonding review
● Means of combining multiple NIC's together for:
○ Failover
○ Load Balancing
○ More Bandwidth
● Available since Linux Kernel 2.0.x. Stable and proven.
● Many modes of bonding NIC's:
○ Active/Standby
○ Active/Active
NIC Bonding Modes
● mode = 1: active-backup   <--
● mode = 2: balance-xor
● mode = 3: broadcast
● mode = 4: 802.3ad (LACP)  <--
● mode = 5: balance-tlb
● mode = 6: balance-alb
● mode = 7: balance-slb     <--
XCP Bonding: Source Level Balancing
● XCP + XenServer introduce optimized bonding for virtualization.
● mode = 7, aka balance-slb.
● Derived from balance-alb.
● Spread VIF's across PIF's.
● Provides load balancing and failover.
● Active/Active.
XCP Bonding: Source Level Balancing
● New VIF source MAC's assigned a PIF w/ lowest util.
● Rebalances VIF's/MAC's across PIF's every 10 sec.
○ No GARP during rebalance necessary.
○ Switch will see new traffic and update tables.
○ Still need to connect PIF's to same/stacked switch.
● Up/Down delay of 31s/200ms.
● Failover on link down handled with GARP for fast updates.
XCP Bonding: Source Level Balancing
● Limitation: 16 unbonded NIC's or 8 bonded.
● Limitation: Only 2 NIC's per bond in XenCenter.
● Can override with xe command line:
● xe bond-create network-uuid=... pif-uuids=...,...,...
● Can override bonding mode if desired:
● xe pif-param-set uuid=<bond pif uuid> \ other-config:bond-mode=<active-backup, 802.3ad>
XCP VLAN's
● PIF but with a tag.
● Can apply to Ethernet NIC's and Bonds.
● xe vlan-create network-uuid=... pif-uuid=... tag=...
Traditional Advanced Networking
● Manual configuration process.
○ Bonding? /etc/modprobe.conf and ifenslave
○ Bridges? brctl from bridge-utils
○ Vlans? vconfig
○ GRE? IPSEC? QoS/Rate Limiting?
● Distribution specific configuration files.
Virtualization and Advanced Networking
● Virtualization brought network switching into the server itself.
● Systems & services no longer fixed.
● Nomadic... VM's move around w/o Network Admin knowing.
● SPAN ports for IDS? Netflow information for a specific VM? QoS and rate limiting? How is this handled?
OpenvSwitch
● Software switch like Cisco Nexus 1000V.
● Distribution agnostic. Plugs right into Linux kernel.
● Reuses existing Linux kernel network subsystems.
● Compatible with traditional userspace tools.
● Free and Open Source - hence the "open"... ;-)
● http://openvswitch.org/
Why use OpenvSwitch?
● Why use it in general?
● Why does XCP/XenServer use OpenvSwitch?
OpenvSwitch Centralized Management
● Software Defined Networking. Keep data plane, centralize control plane.
● Distributed Virtual Switch Controller (DVSC):
○ OpenFlow
○ OVSDB Management Protocol
● Ensures sFLOW, QoS, SPAN, Security policies follow VM's as they move & migrate between XCP hosts.
● Citrix XenServer DVSC works with XCP.
Cross Server Private Networks
● Traditional Approach:
○ Use dedicated NIC's with separate switches.
○ Use a private dedicated non-routed VLAN.
● Management and scalability issues.
● Works for small deployments.
Cross Server Private Networks
● New Approach: GRE Tunnels
● GRE Tunnel between each XCP host.
● Build/Teardown as needed. Don't need to waste b/w.
● Administration nightmare?
○ Not if you had some sort of... controller... to manage it for you...?
○ Oh wait! We have one of those!
XCP Tunnel PIF
● Special PIF called "tunnel" in XCP.
● Commands: xe tunnel-*
● Placeholder for OpenvSwitch & DVSC to work with.
XCP Tunnel PIF
1. Create new network in XCP:
xe network-create name-label="Cross Server Private Network"
2. Create tunnel PIF on each XCP host for use w/ this net:
xe tunnel-create network-uuid=<uuid> pif-uuid=<uuid>
3. Add VIF's of VM's to this private network.
DVSC will handle the setup/teardown of GRE tunnels between XCP hosts automatically as needed.
Statistics and Monitoring with Xen Cloud Platform
Statistics, Monitoring, Analysis
● Citrix XenCenter
● Existing Solutions (Hyperic, Nagios, Cacti, Observium)
● Programmable Means:
○ API
○ SSH
○ SNMP
Citrix XenCenter
● Built in graphical presentation of all XenServer/XCP metrics.
● Live view of current activity.
● Memory allocation per host, per pool.
● Excellent way to get solid overview of XCP deployment.
● VirtualBox/Parallels/Vmware + Windows
XCP and Nagios
● XCP == CentOS 5.x (+ Xen + Kernel + XAPI)
● Install NRPE on dom0.
● Monitor just like any other Linux box.
XCP and SNMP
● net-snmp installed on XCP.
● Simple steps to enable SNMP:
a. Open UDP/161 in /etc/sysconfig/iptables
b. Adjust /etc/snmp/snmpd.conf permissions
c. chkconfig snmpd on && service snmpd start
● Standard Linux host metrics.
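A sketch of steps (a) and (b); the community string and management subnet are placeholders, and the iptables chain name follows the CentOS 5 default:

# /etc/sysconfig/iptables - allow SNMP queries in, then restart iptables
-A RH-Firewall-1-INPUT -p udp --dport 161 -j ACCEPT

# /etc/snmp/snmpd.conf - read-only community restricted to the monitoring network
rocommunity public 192.168.0.0/24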
Monitoring XCP with the XenAPI
● Linux SNMP and Nagios NRPE only give basics.
● SR usage? Pool utilization?
● VM metrics? VIF/VBD rates?
● All of this information is available.
Monitoring XCP with the XenAPI
XenAPI and SR Metrics

>>> import XenAPI
>>> from pprint import pprint
>>> session = XenAPI.Session('http://127.0.0.1')
>>> session.login_with_password('root', 'secret')
>>> session.xenapi.SR.get_all()
['OpaqueRef:18c80a5d-cef6-c2e8-59d1-a03cfbed97e5', 'OpaqueRef:94f13ac8-6d8b-9bc0-2c71-fd29c9636f4e', ...]
>>> pprint(session.xenapi.SR.get_record('OpaqueRef:18c80a5d-cef6-c2e8-59d1-a03cfbed97e5'))
XenAPI and Events

>>> import XenAPI
>>> from pprint import pprint
>>> session = XenAPI.Session('http://127.0.0.1')
>>> session.login_with_password('root', 'secret')
>>> session.xenapi.event.register(["*"])
''
>>> session.xenapi.event.next()
See examples on http://community.citrix.com/
Enterprise Cloud Orchestration and XCP
Enterprise Cloud Orchestration
● Hypervisor Agnostic* approach to orchestrating your cloud(s).
● Suited for solving multi-tenancy requirements.
● Orchestrate vs Manage?
● I'm not a cloud provider. Why do I care?
○ Traditional approach.
○ Developer delegation
IaaS Orchestration & XCP
OpenStack http://www.openstack.com
CloudStack http://www.cloudstack.org
OpenStack Overview
● Rackspace & NASA with other major contributors:
○ Intel & AMD
○ Red Hat, Canonical, SUSE
○ Dell, HP, IBM
○ Yahoo! & Cisco
● Hypervisor Support:
○ KVM & QEMU
○ LXC
○ Xen (via libvirt)
○ XenServer, Xen Cloud Platform, XenAPI (Kronos)
OpenStack Overview
● Language: Python
● Packages for Ubuntu and RHEL/CentOS (and more)
● MySQL and PostgreSQL (yay!) Database Support
● Larger project than CloudStack, encompassing many more functional areas:
○ Storage (swift, nova volume --> cinder)
○ Networking (nova network, quantum)
○ Load Balancing (Atlas)
OpenStack and XCP
● http://wiki.openstack.org/XenServer/GettingStarted
● http://wiki.openstack.org/XenServer/XenXCPAndXenServer
● Optimize for XenDesktop on Installation (EXT vs LVM)
● Plugins for XCP host: /etc/xapi.d/plugins
● Different way of thinking -- the Xen way
○ Run OpenStack services on the host/dom0? No!
○ Each XCP host has a dedicated nova VM.
○ The OpenStack VM controls the XCP host via XenAPI (see the sketch below).
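Inside that dedicated nova VM, the compute service is pointed at the host's XAPI endpoint via nova.conf, roughly as below. The flag names follow the Essex/Folsom-era conventions and the address/credentials are placeholders; check the OpenStack XenServer guide linked above for your release:

# nova.conf excerpt (illustrative)
connection_type=xenapi
xenapi_connection_url=https://192.168.0.50      # the XCP host's management interface
xenapi_connection_username=root
xenapi_connection_password=secret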
OpenStack and XCP Pools
● XCP Pools / OpenStack Host Aggregates
○ http://wiki.openstack.org/host-aggregates
○ Informs OpenStack that the XCP hosts have a collection of shared resources.
○ Works but incomplete -- e.g. if pool master changes?
○ Recommended that you don't pool your XCP hosts when orchestrating via OpenStack, for now...
● Traditional vs Cloud Workloads
OpenStack and XCP Storage
● Optimize for XenDesktop on XCP installation.
○ Local SR uses EXT instead of LVM
● Plugins need raw access to VHD files on host/dom0.
● Can use NFS for instance image storage:
○ Switch the default SR to an NFS SR.
○ nova.conf: sr_matching_filter="default-sr:true"
● OpenStack Cinder will use Storage XenMotion
CloudStack Overview
● VMOps aka Cloud.com ---> Citrix July 2011
● Hypervisor Support:
○ Citrix XenServer (thus XCP)
○ KVM
○ VMware vSphere
○ Oracle VM
● Multiple hypervisors in single deployment
● Languages: Java and C
CloudStack and XCP
● CloudStack doesn't provide storage -- no nova-volume
● CloudStack uses existing SAN/NAS appliances:
○ Dell EqualLogic (iSCSI)
○ NetApp (NFS and iSCSI)
● Primary and Secondary Storage (tiering)
● Supports use of additional XenServer SR's (e.g. FC) instead of NFS/iSCSI.
{Open,Cloud}Stack -- Which?
● Depends on your team, experience, and intentions.
● CloudStack:
○ Want a cloud *now*?
○ Very mature and full featured.
○ Integrates well w/ both traditional & cloud workloads.
● OpenStack:
○ Have some time?
○ Easily extendable to do new things (Python).
○ XS/XCP support needs work, but it's getting there.
Questions?
Unit 4: The Future of Xen
Update from the Xen.org team
Outline
● Xen.org development: Who / What?
● Xen 4.2
● Microsoft, UEFI secure boot, and Win8
● Xen 4.3
● Other activities
Xen.org development
Who develops Xen?
● 7 full-time developers from Citrix
● Full-time devs from SuSE, Oracle
● Frequent contributions from Intel, AMD

What do we develop?
● Xen hypervisor, toolstack
● Linux
● qemu
Xen 4.2 features
● pvops dom0 support
● New toolstack: libxl/xl
● cpupools
● New scheduler: credit2
● Memory sharing, page swapping
● Nested virtualization
● Live fail-over (Remus)
libxl/xl
The motivation:
● xend: daemon, Python
● xapi: duplicated low-level code

The solution:
● libxl: lightweight library for basic tasks
● xl: lightweight, xm-compatible replacement
cpupools
The motivation:
● Service model: rent CPUs, run as many VMs as you want
● Allow customers to use "weight"

The solution: cpupools
● Pools can be created at run-time
● CPUs added or removed from pools
● Domains assigned to pools
● Each pool has a separate scheduler
cpupools, cont'd

Uses:
● New service model
● Different schedulers
● Stronger isolation
● NUMA-split
UEFI secure boot
● Microsoft, UEFI, and the Windows 8 logo
● What that means for Linux
● Fedora's solution
● Ubuntu's solution
● What it means for Xen
Xen 4.3
● Performance
● NUMA issues
● *BSD dom0 support
● Memory sharing / hypervisor swap
● ARM servers
● blktap3
Other areas of focus
● Distro integration
● Doc days
Questions?
Closing Remarks
Useful Resources and References
Community:
● Xen Mailing List: http://www.xen.org/community/
● Xen Wiki: http://wiki.xen.org
● Xen Blog: http://blog.xen.org

Discussion:
● http://www.xen.org/community/xenpapers.html
● Abstracts, slides, and videos from Xen Summits
● http://pcisecuritystandards.org/organization_info/special_interest_groups.php
Image Credits
● http://en.wikipedia.org/wiki/File:Tux.png
● http://en.wikipedia.org/wiki/File:Intertec_Superbrain.jpg
● http://wiki.xen.org/wiki/Xen_Overview
Thank You!
Enjoy the rest of OSCON 2012!
XCP Architecture
Acknowledgments
This work is based upon many materials from the 2011 Xen Day Boston slides, by Todd Deshane, Steve Maresca, Josh West, and Patrick F. Wilbur.
Portions of this work are derived from the 2010 Xen Training / Tutorial, by Todd Deshane and Patrick F. Wilbur, which is derived from the 2009 Xen Training / Tutorial as updated by Zach Shepherd and Jeanna Matthews from the original version written by Zach Shepherd and Wenjin Hu, originally derived from materials written by Todd Deshane and Patrick F. Wilbur. A mouthful!
Portions of this work are derived from Mike McClurg's The Xen Cloud Platform slides from the July 2012 Virtual Build a Cloud Day.
Portions of this work are based upon Jeremy Fitzhardinge's Pieces of Xen slides.