Bringing ARM Servers to the Cloud: Experiences and Opportunities
Ripal Nathuji, Calxeda
CloudOpen 2012
Background: Combining themes
• Power management
• Multicore systems
• Virtualization
• Scale-out server and network infrastructure
=> Cloud computing
Outline
• Motivation: GreenClouds
• Cloud stacks on ARM
• OpenStack case study
• Baremetal clouds
• Future and opportunities
• Conclusions / Q&A
POWER EFFICIENT SCALABLE CLOUDS
GreenClouds
The impact of scale
Source: EPA Report to Congress, 2007
The cost of power
• Encumbered cost != electricity cost
• Trends
  – HW costs going down
  – Energy costs going up
[Chart: illustrative data center cost breakdown: Servers 56%, Power & Cooling Infrastructure 24%, Power 15%, Other Infrastructure 5%]
Illustrative model (http://perspectives.mvdirona.com/2008/11/28/CostOfPowerInLargeScaleDataCenters.aspx)
Optimizing for power
• Goals
  – Maximize infrastructure deployment within power/cooling constraints
  – Optimize infrastructure costs under SLAs (Perf / Watt / $)
• Requirements
  – Fungible HW resources
  – Efficient mapping of workloads to HW
=> Heterogeneous HW for a heterogeneous mix of applications
[Diagram: applications on a cloud software stack over physical hardware, forming the cloud infrastructure]
Innovations in cloud hardware
[Diagram: physical hardware innovations such as PCM and Fusion-io flash storage]
BUILDING CLOUD STACKS ON ARM
Cloudy with a chance of ARM
Open source IaaS clouds
[Diagram: open source IaaS stack: cloud software on top of virtualization software (KVM) on a Linux server OS]
The typical case for virtualization
• Workload consolidation
  – Map multiple workloads to a scaled-up server
  – Issues: guaranteed performance, resource isolation, etc.
• Manageability
  – SW interfaces/abstractions: resource scaling, network management (e.g. vswitch), ...
[Diagram: multiple applications, each in its own guest kernel/OS, running on a host OS / hypervisor on a 12-core / 64 GB memory server]
Revisiting assumptions
• Workload consolidation
  – Physicalization: 1:1 mapping from app to platform
  – Guaranteed performance, resource isolation, etc.
• Manageability
  – OOB management support in HW for basic control
  – Use of standard infrastructure interfaces: IPMI, OpenFlow-enabled switches
[Diagram: a virtualized 12-core / 64 GB memory server (applications on guest kernel/OS on a host OS / hypervisor) contrasted with 4-core / 4 GB memory servers, each running a single application directly on its own kernel/OS]
Instances on baremetal
• Cloud controller manages physical machines and network infrastructure
  – Provisions OS, network, etc.
• User provides kernel and OS file system
  – Dedicated access to I/O devices, etc.
[Diagram: cloud controller provisioning an application and user kernel/OS directly onto server HW]
Linux containers (LXC)
• Lightweight virtual systems
  – Process and resource isolation provided by the kernel
  – "chroot on steroids"
• LXC support available today in OpenStack (see the usage sketch below)
[Diagram: multiple apps in LXC containers sharing a single host kernel on server HW]
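As a concrete feel for how lightweight these containers are, here is a minimal sketch using the LXC userspace tools of that era; it assumes the lxc tools and an "ubuntu" template are installed, and the container name "node1" is just an illustrative choice.

    import subprocess

    def run(cmd):
        # Print and run a command, raising if it fails.
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd)

    # Create a container from the Ubuntu template (bootstraps a root file system).
    run(["lxc-create", "-t", "ubuntu", "-n", "node1"])

    # Start it daemonized; the container shares the host kernel.
    run(["lxc-start", "-n", "node1", "-d"])

    # List the containers known to this host.
    run(["lxc-ls"])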
Instances with Linux containers
• Cloud controller manages containers through a SW interface (e.g. libvirt, see the sketch below)
  – Create instance, bridge interface, etc.
• User provides the OS file system, but cannot specify a kernel
• Consolidation possible via multiple containers
[Diagram: cloud controller driving an LXC container (app + OS file system) on the host kernel and server HW]
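As one illustration of the kind of software interface the controller uses, this is a minimal sketch of defining and starting an LXC domain through the libvirt Python bindings; the domain name, rootfs path, bridge, and sizes are placeholder values, not anything from the TryStack deployment.

    import libvirt

    # Domain XML for a minimal LXC "machine": share the host kernel and
    # run /sbin/init from a user-provided root file system.
    DOMAIN_XML = """
    <domain type='lxc'>
      <name>demo-container</name>
      <memory unit='MiB'>512</memory>
      <vcpu>1</vcpu>
      <os>
        <type>exe</type>
        <init>/sbin/init</init>
      </os>
      <devices>
        <filesystem type='mount'>
          <source dir='/srv/containers/demo/rootfs'/>
          <target dir='/'/>
        </filesystem>
        <interface type='bridge'>
          <source bridge='br100'/>
        </interface>
        <console type='pty'/>
      </devices>
    </domain>
    """

    conn = libvirt.open("lxc:///")      # connect to the LXC driver
    dom = conn.defineXML(DOMAIN_XML)    # register the container definition
    dom.create()                        # start it
    print(dom.name(), "running:", dom.isActive() == 1)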
Clouds with ARM
• Using baremetal or LXC
  – Tradeoffs specific to deployment requirements and use cases
• Real examples
  – LXC with OpenStack
  – Baremetal with OpenStack and CloudStack
OPENSTACK WITH LXC: TRYSTACK ARM ZONE
Try a test drive
TryStack
TryStack ARM Zone
A joint effort across industry partners to provide the first publicly available access to ARM servers using OpenStack.
OpenStack overview
Basic Essex deployment:
• Identity service: Keystone
• Image service: Glance
• Cloud controller service: Nova
[Diagram: Keystone (Identity) and Glance (Image) alongside Nova (Controller) with its scheduler, API, network, and other sub-services]
Realizing OpenStack in CoreNap
• Dedicated service hosts for deployment/OpenStack services
• ARM hosts dedicated as compute nodes
[Diagram: Keystone, Glance, and Nova on the service hosts; nova-compute on the ARM nodes]
Network configuration
• OpenStack networking is extremely flexible
  – Many different configurations possible
• TryStack zone requirements
  – Each compute instance needs a public IP
  – Maintain a private network for the compute hosts
• Chosen configuration (see the sketch below)
  – Configure the public IP directly into the instance (instead of floating IPs)
  – Point instances at the hardware gateway
[Diagram: containers on a host attached to a Linux bridge; a 10.x.x/24 private network for the hosts, with containers holding public 208.123.85.x addresses]
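A minimal sketch of what "public IP directly into the instance" could look like: writing a Debian-style /etc/network/interfaces into a container's root file system before it boots, pointing at the hardware gateway. The addresses, gateway, and rootfs path are placeholders, not the actual TryStack values.

    import os

    # Hypothetical per-container values; substitute real ones at provisioning time.
    ROOTFS = "/srv/containers/demo/rootfs"
    PUBLIC_IP = "208.123.85.10"
    NETMASK = "255.255.255.0"
    GATEWAY = "208.123.85.1"      # the hardware gateway the instances point at

    INTERFACES = """auto lo
    iface lo inet loopback

    auto eth0
    iface eth0 inet static
        address {ip}
        netmask {mask}
        gateway {gw}
    """.format(ip=PUBLIC_IP, mask=NETMASK, gw=GATEWAY)

    path = os.path.join(ROOTFS, "etc/network/interfaces")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(INTERFACES)
    print("wrote", path)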
Automated deployment of OpenStack
• Need to get from baremetal to a cloud
• We built a deployment framework of PXE + Chef (see the sketch below)
  – Ubuntu network installer + preseed
  – Opscode Chef cookbooks for Keystone, Glance, and Nova (controller/compute)
  – (Working with Opscode to incorporate these into the public open source cookbooks)
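As a rough illustration of the PXE side of such a framework, the sketch below writes a pxelinux.cfg-style entry that boots the Ubuntu network installer with a preseed file (a format that U-Boot's PXE support can also read). The TFTP root, kernel/initrd names, and preseed URL are made-up placeholders; the actual framework's layout is not shown in the slides.

    import os

    TFTP_ROOT = "/var/lib/tftpboot"                            # assumed TFTP root
    PRESEED_URL = "http://deploy.example.com/preseed.cfg"      # placeholder
    KERNEL = "installer/vmlinuz"                               # placeholder paths
    INITRD = "installer/initrd.gz"

    ENTRY = ("default install\n"
             "label install\n"
             "    kernel {k}\n"
             "    append initrd={i} auto=true priority=critical url={p} ---\n"
             ).format(k=KERNEL, i=INITRD, p=PRESEED_URL)

    cfg_dir = os.path.join(TFTP_ROOT, "pxelinux.cfg")
    os.makedirs(cfg_dir, exist_ok=True)
    with open(os.path.join(cfg_dir, "default"), "w") as f:
        f.write(ENTRY)
    print("wrote PXE config to", cfg_dir)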
Installing an ARM compute node
1. Set up PXE files (kernel, initrd, preseed, and post-install script)
2. Power on nodes via IPMI (see the sketch below)
3. Nodes install Ubuntu and set up Chef as part of post-install
4. Boot into the installed OS and execute the Nova compute cookbook via Chef
5. The nova-compute service starts and the node is part of the cloud
[Diagram: the newly installed nova-compute node joining the OpenStack controller]
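Step 2 can be as simple as looping over the nodes' BMC addresses with ipmitool. A minimal sketch, assuming IPMI-over-LAN is reachable; the addresses and credentials shown are placeholders.

    import subprocess

    # Placeholder BMC addresses and credentials for the nodes to be installed.
    NODES = ["10.0.0.101", "10.0.0.102", "10.0.0.103"]
    IPMI_USER = "admin"
    IPMI_PASS = "secret"

    def ipmi(host, *args):
        # Issue a single IPMI-over-LAN command via ipmitool.
        cmd = ["ipmitool", "-I", "lanplus", "-H", host,
               "-U", IPMI_USER, "-P", IPMI_PASS] + list(args)
        subprocess.check_call(cmd)

    for node in NODES:
        ipmi(node, "chassis", "bootdev", "pxe")   # boot from the network next time
        ipmi(node, "chassis", "power", "on")      # power the node on
        print(node, "powering on via PXE")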
Basic monitoring
• Detecting outages with Nagios
  – Simple to set up basic node and service monitoring
  – Combine both internal and external monitoring (see the check sketch below)
• Measuring resource consumption with Munin
  – Package install and basic configuration can provide lots of data
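Nagios checks are just executables that print a status line and exit 0/1/2 for OK/WARNING/CRITICAL. A minimal external check sketch in Python, assuming you only want to verify that a compute node still answers on its SSH port; the hostname is a placeholder.

    #!/usr/bin/env python
    import socket
    import sys

    HOST = "arm-compute-01.example.com"   # placeholder node name
    PORT = 22                             # SSH as a cheap liveness signal
    TIMEOUT = 5.0

    try:
        sock = socket.create_connection((HOST, PORT), timeout=TIMEOUT)
        sock.close()
        print("OK - %s:%d is reachable" % (HOST, PORT))
        sys.exit(0)          # Nagios: OK
    except socket.error as err:
        print("CRITICAL - %s:%d unreachable (%s)" % (HOST, PORT, err))
        sys.exit(2)          # Nagios: CRITICAL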
Operational lessons
• Importance of an automation framework
  – Chef, Puppet, Juju, etc.
  – Scalable deployment
  – Adding/changing capabilities or configuration options
  – Provisioning of additional nodes when capacity increases
• Service logging
  – OpenStack components are good about logging
  – Makes it possible to debug issues, though combing through distributed log files can be cumbersome
TryStack ARM zone summary
• Real-world example of running a cloud on ARM hardware
• Use of Linux containers with OpenStack proves out the technology stack
• Live since mid-July at http://arm.trystack.org
  – More details on getting access at http://trystack.org
BAREMETAL PROVISIONING
Pedal to the metal
Baremetal use cases
• Physicalization
  – When your application requires the resources of an entire node
• Performance
  – Native I/O
  – Guaranteed resource allocations
[Diagram: one app per physical node]
Baremetal deployment
• No host OS/hypervisor to manage through
• Need platform and infrastructure support
[Diagram: application on a user kernel/OS on a host OS/HV on a physical server vs. application on a user kernel/OS directly on the physical server]
Platform requirements
• Turn the "container" on/off
  – Turn the physical server on/off via IPMI
• Image pull by the host
  – PXE boot support on the physical server
• Provision the specified kernel/OS
  – Configure the PXE server with the user's artifacts
[Diagram: provisioning/PXE server delivering the user kernel/OS to the physical server, with power control over IPMI]
Baremetal instances in OpenStack
• Baremetal support introduced in Essex
  – Initially targeted at Tilera-based systems used by ISI
• Continued development for the Folsom release includes support for PXE + IPMI
  – Led by ISI and NTT Docomo
[Timeline: OpenStack Essex to OpenStack Folsom]
OpenStack Folsom baremetal support
1. On instance creation, configure the PXE server on nova-compute with node-specific configuration (see the sketch below)
2. Power the system via IPMI and load an infrastructure kernel/initrd via PXE
3. The node exports its drive as an iSCSI target and invokes a remote utility service
4. The utility service formats and images the drive via the iSCSI endpoint, including copying the user kernel/initrd; at the end it modifies the PXE config so the node boots from disk
5. The node reboots (it still PXE boots, but is pointed at the local image)
[Diagram: steps 1-5 between the nova-compute server and the physical server]
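PXE servers conventionally look up per-node configuration by MAC address (a file named "01-" plus the MAC with dashes under pxelinux.cfg). Below is a minimal sketch of what step 1 could look like under that assumption; it is not the actual nova baremetal driver code, and the paths, kernel names, and boot parameters are placeholders.

    import os

    TFTP_ROOT = "/var/lib/tftpboot"

    def pxe_config_path(mac):
        # pxelinux.cfg convention: "01-" prefix plus the MAC with dashes.
        return os.path.join(TFTP_ROOT, "pxelinux.cfg",
                            "01-" + mac.lower().replace(":", "-"))

    def write_node_config(mac, kernel, initrd, append=""):
        # Point this specific node at the deploy (or local-boot) image.
        entry = ("default deploy\n"
                 "label deploy\n"
                 "    kernel {k}\n"
                 "    append initrd={i} {extra}\n"
                 ).format(k=kernel, i=initrd, extra=append)
        path = pxe_config_path(mac)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(entry)
        return path

    # Example: hand the node a deploy ramdisk before imaging (placeholder names).
    print(write_node_config("fc:2f:40:12:34:56",
                            "deploy/vmlinuz", "deploy/initrd.img",
                            append="deploy_target=node1"))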
CloudStack Acton baremetal WIP
• CloudStack provides a baremetal framework via network service providers
• Actively being integrated into Acton by Citrix
  – Baremetal zones with DHCP and PXE service providers
• PXE service providers
  – PING for x86
  – Active development of an alternative PXE service provider for ARM
[Diagram: DHCP service providers (DHCPD, DNSMASQ) and PXE service providers (PING, ARM)]
CloudStack baremetal support
1. On instance creation, configure the DHCP provider with an IP and the PXE service provider with node-specific configuration
2. Power the system via IPMI and load an infrastructure kernel/initrd via PXE
3. Image the node using template information provided via a kernel boot parameter
4. The node reboots and subsequently boots from disk
[Diagram: steps 1-4 between the CloudStack controller and the physical server]
Baremetal challenges
• Users have ownership of the physical HW
  – Feature capabilities
• Different security/isolation model
  – Tenant network isolation
[Diagram: tenant apps on separate nodes sharing a switch]
Baremetal feature parity
• Loss of the host OS/HV introduces possible feature gaps
  – Snapshotting
  – Live migration
  – ...
[Diagram: application and user kernel/OS with the host OS/HV layer removed]
Baremetal network management
• Device configuration
  – Isolate tenants using VLANs and ACLs on the network devices
  – Comes with the usual VLAN and ACL issues (scalability, manageability, etc.)
• OpenFlow (see the sketch below)
  – Use OpenFlow-enabled switches for programming the network
  – Supported by the NTT baremetal implementation
[Diagram: tenant apps on separate nodes isolated at the switch]
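One common way to get VLAN-based tenant isolation on a programmable, Open vSwitch based switch is to tag each node-facing port with that tenant's VLAN. This is a sketch of that idea, not the NTT implementation; the bridge/port names and VLAN IDs are placeholders, and it assumes the ovs-vsctl tool is available.

    import subprocess

    BRIDGE = "br-tenants"                 # placeholder OVS bridge name

    # node-facing port -> tenant VLAN (illustrative values only)
    PORT_VLANS = {
        "port-node1": 100,   # tenant A
        "port-node2": 100,   # tenant A
        "port-node3": 200,   # tenant B
    }

    def ovs(*args):
        subprocess.check_call(["ovs-vsctl"] + list(args))

    ovs("--may-exist", "add-br", BRIDGE)
    for port, vlan in PORT_VLANS.items():
        # Attach the port and make it an access port on the tenant's VLAN,
        # so traffic from different tenants cannot reach each other directly.
        ovs("--may-exist", "add-port", BRIDGE, port)
        ovs("set", "port", port, "tag=%d" % vlan)
        print("%s isolated on VLAN %d" % (port, vlan))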
Baremetal summary
• Use cases include physicalization and/or when native performance is imperative
• Both OpenStack and CloudStack will have support
  – Other solutions, like Ubuntu MaaS, are also coming soon
• Challenges include feature differences and managing network isolation
BUILDING TOWARDS THE FUTURE
Realizing the vision
Future opportunities
• Rapidly evolving ARM roadmap
• Building beyond IaaS
• Dynamic management algorithms
ARM roadmap
• Increasing performance and core counts
  – Future generations will build on the success of the A9s
• Virtualization support
  – Coming with the A15s (Linux KVM)
• ARMv8 architecture
  – 64-bit ISA
Building from IaaS to PaaS
• Open source PaaS software is evolving rapidly
• PaaS deployments will soon become more prominent
  – IaaS providers are adding flavors of PaaS
• PaaS abstractions can completely hide the underlying physical hardware
Building GreenClouds
• Infrastructure and applications targeted for scale-out
• Power-efficient clouds will require more than just bringing HW, stacks, and applications together
  – Dynamic management of applications and infrastructure
  – Application instrumentation hooks (e.g. integrated as part of a PaaS)
[Diagram: applications dynamically managed across the cloud infrastructure]
Conclusions
• ARM servers are here and ready for the cloud
• The open source cloud software ecosystem is rapidly growing and maturing
  – Virtualization, IaaS, and PaaS software
• Lots of work left to do to achieve the possibilities enabled by innovations in software and hardware
Open source SW used
Acknowledgements
Questions