Post on 18-Feb-2017
Lessons Learned Running the Largest OpenStack Clouds
KENNETH HUI
Senior Technical Marketing Engineer
Technology Evangelist
@kenhuiny
RACKSPACE PUBLIC CLOUD
• 6 geographic regions around the globe
• Tens of thousands of hypervisors
• Over 350,000 cores, over 1.2 petabytes of RAM
• Hundreds of thousands of virtual machines
• Several hundred on-metal instances
• Hundreds of thousands of virtual switch ports
KEY COMMUNITY CONTRIBUTIONS TO OPENSTACK
• Concept of Nova Cells to scale regions to thousands of nodes
• Tempest: the initial QA test framework for OpenStack
• OpenStack Ansible deployment project
• Magnum: the container management system
• Rewriting the Swift object server in Go to meet hyper-scale demands
• Barbican: the key management service
RACKSPACE’S LEADERSHIP
• Freely share lessons learned
• Contribute code and ideas to the OpenStack project
• Open source tools based on what we use to operate our clouds
OPENSTACK INNOVATION CENTER (OSIC)
THREE PILLARS
1. Train the next generation of OpenStack contributors
2. Contribute to the removal of enterprise barriers to OpenStack adoption
3. Provide an avenue for operational scale testing to the OpenStack community
ORIGIN STORY
• Before OpenStack, there was Slicehost
• Scaling limits led to OpenStack
• Xen is Slicehost’s legacy in the Rackspace Public Cloud
• Tens of thousands of existing customers meant starting at scale
• Private Cloud started with a clean sheet of paper
RACKSPACE’S APPROACH
• Continuously upgrade our public cloud
  – Deploy upstream OpenStack code
  – Patch regularly
• Only use projects stable enough to run in production at scale
• Don’t reinvent the wheel
• Change code in production to meet scale requirements
  – Certain bugs we only find in production
  – Contribute back upstream when appropriate
• Move ahead of community when necessary
  – Create service with internal software
  – Contribute code and lessons learned to project
  – Switch to project code when ready
PARTITION YOUR CLOUD
• Why Cells?
  – Scaling: DB & RabbitMQ
  – Reduce failure impact
  – Broadcast domains / Nova
  – Multiple compute flavors: SSD
  – Multiple hardware types
• How we use Cells
  – ~100 hosts per cell: scaling / failure impact
  – Multiple cells per region: failure impact
  – Group same flavor types
  – Group servers from same vendor: live migration
• Takeaways
  – Use cells from day 1
  – Plan for scale
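The grouping rules under "How we use Cells" can be sketched in a few lines of Python. This is only an illustration of the idea, not Rackspace's tooling or the Nova Cells API: the `Host` fields, the `plan_cells` helper, and the cell naming scheme are all hypothetical, and the 100-host cap is the deck's approximate figure treated as a tunable.

```python
from collections import defaultdict
from dataclasses import dataclass

# The deck's "~100 hosts per cell"; an assumption you would tune.
MAX_HOSTS_PER_CELL = 100


@dataclass(frozen=True)
class Host:
    name: str
    vendor: str  # hypothetical field: hardware vendor
    flavor: str  # hypothetical field: flavor class served by this host


def plan_cells(hosts, max_hosts=MAX_HOSTS_PER_CELL):
    """Group hosts by (flavor, vendor), then split each group into cells.

    Keeping one vendor and one flavor class per cell mirrors the slide's
    "group same flavor types" and "group servers from same vendor" bullets.
    """
    groups = defaultdict(list)
    for host in hosts:
        groups[(host.flavor, host.vendor)].append(host)

    cells = {}
    for (flavor, vendor), members in sorted(groups.items()):
        members.sort(key=lambda h: h.name)
        # Chunk each homogeneous group so no cell exceeds max_hosts.
        for start in range(0, len(members), max_hosts):
            cell_name = f"{flavor}-{vendor}-{start // max_hosts + 1:02d}"
            cells[cell_name] = members[start:start + max_hosts]
    return cells


if __name__ == "__main__":
    fleet = [
        Host(f"c{i:04d}", "vendor-a" if i % 2 else "vendor-b",
             "ssd" if i < 300 else "standard")
        for i in range(450)
    ]
    for name, members in plan_cells(fleet).items():
        print(name, len(members))
```

Capping cell size bounds how many hosts share one database and message queue, and keeping each cell homogeneous means every host in it is a plausible live-migration target for its neighbors.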
ABSTRACT YOUR CONTROL PLANE
• iNova: ancestor to TripleO
  – Seed servers in each region
  – Seed servers & Cells run on VMs
  – Easy to deploy, tear down, redeploy services
  – React to issues quickly: spikes
• Virtualized compute nodes
  – Nova compute runs as VM on compute node
  – Limits impact of compute node failure
  – Reboot compute node but not hypervisor
  – Security isolation
• Takeaways
  – Explore TripleO: Red Hat OpenStack
  – Containerize your control plane: OSA
  – Protect your control plane: use HA
AUTOMATE EVERYTHING
• Operator error is more common than software failure
• Automation = Making time
• OpenStack Ansible
  – Encodes recommended practices
  – Rackspace Private Cloud RA
  – Highly customizable
  – Great community support
• Takeaways
  – Automation starts day 1
  – Pick an appropriate tool and run with it
USE FLEET MANAGEMENT
• Failure is inevitable at scale
• We created tools to manage the fleet
  – Auditor: monitor for rules compliance
  – Resolver: automate tasks based on events
  – Use cases
    • Upgrades and patches: Xen vulnerability live patch
    • Maintenance: live migration
• Takeaways
  – Focus on service availability over component availability
  – You can’t manage what you don’t know
  – Leverage live migration
  – Check out Project Craton
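The Auditor/Resolver split can be pictured as two small functions: one checks every host against a set of rules and emits events, the other maps events to remediation tasks. The sketch below is a guess at the shape of such a loop, not the actual Rackspace tools or Project Craton; every name in it (`Host`, `Event`, `RULES`, `HANDLERS`, the fact keys, the task strings) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    facts: dict  # hypothetical inventory facts, e.g. {"hypervisor_patched": False}


@dataclass
class Event:
    host: str
    rule: str


# Hypothetical compliance rules: name -> predicate over a host's facts.
RULES = {
    "hypervisor_patched": lambda facts: facts.get("hypervisor_patched", False),
    "no_pending_maintenance": lambda facts: not facts.get("maintenance_due", False),
}

# Hypothetical remediations: rule name -> function producing a task description.
HANDLERS = {
    "hypervisor_patched": lambda event: f"live-patch {event.host}",
    "no_pending_maintenance": lambda event: f"live-migrate instances off {event.host}",
}


def audit(hosts, rules):
    """Auditor: return an Event for every (host, rule) pair that fails."""
    events = []
    for host in hosts:
        for rule_name, check in rules.items():
            if not check(host.facts):
                events.append(Event(host.name, rule_name))
    return events


def resolve(events, handlers):
    """Resolver: turn each event into a task, escalating unknown rules."""
    tasks = []
    for event in events:
        handler = handlers.get(event.rule)
        if handler is None:
            tasks.append(f"escalate {event.rule} on {event.host}")
        else:
            tasks.append(handler(event))
    return tasks


if __name__ == "__main__":
    fleet = [
        Host("hv-001", {"hypervisor_patched": True}),
        Host("hv-002", {"hypervisor_patched": False, "maintenance_due": True}),
    ]
    for task in resolve(audit(fleet, RULES), HANDLERS):
        print(task)
```

Keeping detection and remediation separate means a new rule can be audited and reported before anyone trusts an automated fix for it.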
RESOURCES
• Rackspace Public Cloud: https://www.rackspace.com/cloud
• Rackspace Private Cloud: https://www.rackspace.com/cloud/private/openstacksolutions
• OpenStack Innovation Center: https://osic.org/
• Rackspace Blog: http://blog.rackspace.com/
• Rackspace Videos at OpenStack Summits: https://www.youtube.com/user/OpenStackFoundation/playlists
• Project Craton: https://github.com/openstack/craton
THANK YOU
ONE FANATICAL PLACE | SAN ANTONIO, TX 78218
US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM
© RACKSPACE LTD. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM