Managing OpenStack in a cloud-native way
Alberto García
• Red Hat Cloud Architect
• Over 5 years helping companies adopt emerging technologies
• Network engineer in a previous life
Marcel Haerry
• Leading the architecture of Swisscom’s ElasticStack and PaaS
• Member of Cloud Foundry’s Technical Advisory Board
• Automate all the things!
• Background in systems engineering and software development
Our motivation
Use Cases
https://developer.swisscom.com
https://www.mycloud.ch
Modern IT philosophy at Swisscom
• Rapid release cycles to iterate quickly on new features and bugfixes
• Promoting a DevOps culture throughout the teams
• High availability and scalability as you grow
• Fault-tolerant and secure deployments and lifecycle
• Building platforms for the next generation of workloads
• Strong and thorough CI/CD approach: highly automated and tested before promotion through stages
Is it doable?
OpenStack control plane
• Components are decoupled via load balancer and messaging bus
• State is in the database
• Allows dynamic topologies: can be scaled in/out based on the control plane load caused by workload usage
• Control plane services can be virtualized
• Dedicated OpenStack projects for deployment automation
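The "scaled in/out based on control plane load" point can be sketched as a small sizing rule. This is only an illustration under stated assumptions: the function name, the requests-per-second metric, the per-instance capacity and the minimum of two instances are all hypothetical and do not come from the talk.

```python
import math


def desired_instances(requests_per_sec, capacity_per_instance, min_instances=2):
    """Return how many instances of one control plane service to run.

    All parameters are hypothetical: the load metric, the capacity of a
    single instance and the HA floor would come from real measurements.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # Never drop below the floor, so the service stays redundant when idle.
    return max(min_instances, needed)


busy = desired_instances(450, 100)   # load needs more than the floor
quiet = desired_instances(50, 100)   # floor applies
```

Because state lives in the database, instances sized this way can be added or removed without migrating anything.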
The pacemaker HA approach
• All-in-one deployment doesn’t scale as it is (RabbitMQ, Galera)
• Big VMs don’t fit well in virtual environments
• Lifecycle of bare metal is slow
• CI/CD is more complex -> how to iterate on individual components?
• Clustering software is stateful
• Binds the control plane to the infrastructure
HAProxy/Keepalived HA approach
• Based on Javier Peña’s architecture https://github.com/beekhof/osp-ha-deploy/blob/master/HA-keepalived.md
• Pacemaker-free architecture
• Distributed control plane fits well in this model
• Virtualization is feasible thanks to flexibility in the services layout design
• Does not bind the application to the infrastructure
Seems doable,
let’s design it
Distributed & virtualized control plane
• Pulling the pieces apart towards a distributed architecture
• Horizontally scalable services (wherever possible)
• Virtualized control plane
• Isolate shared state (Galera & RabbitMQ)
(Double) Highly Available Architecture
Component            HA model
Web Services         HAProxy
HAProxy              Keepalived
MySQL                Galera
MongoDB              Replica set
RabbitMQ             RabbitMQ native clustering
Redis                Sentinel
Non-API components   Resiliency in the application
Application Level Infrastructure Level
Control Plane
• Hyperconverged
• High density hardware
• Network isolation of storage, control & data
• Network HA with bonding
• Part of a layer 3 spine-leaf design
• Local ephemeral storage
• Simple networking, one network for everything
• Grouping services per major component
• Including lightweight supporting services in the role
• Small sized virtual machines
Compute
Modeling the components
Control Plane Lifecycle
• CI/CD Framework
  Multiple stages to gain confidence in changes
  Clear separation between code and configuration
• Puppet & Deployment Orchestrator for Puppet
  Virtual machines & storage described in code
  Scale-out purely through API calls
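One way to picture "described in code" and "scale-out purely through API calls" is a desired-state description that is compared with what is running. The sketch below is an assumption-laden illustration, not the orchestrator used in the talk: the Role class, the role names, the flavor name and plan_scale are all hypothetical, and the resulting actions would still have to be turned into real API calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical description of one control plane role."""
    name: str
    instances: int
    flavor: str


def plan_scale(desired, running):
    """Return (action, role, count) tuples that move `running` to `desired`.

    `running` maps role name to the number of VMs that currently exist.
    """
    actions = []
    for role in desired:
        have = running.get(role.name, 0)
        if role.instances > have:
            actions.append(("create", role.name, role.instances - have))
        elif role.instances < have:
            actions.append(("delete", role.name, have - role.instances))
    return actions


# Hypothetical example values.
desired = [
    Role("keystone", 3, "m1.small"),
    Role("nova-api", 4, "m1.small"),
    Role("glance", 2, "m1.small"),
]
running = {"keystone": 3, "nova-api": 2, "glance": 3}
plan = plan_scale(desired, running)
```

Changing a role's instance count in the description and re-running the plan is then the whole scale-out procedure; nothing is edited on the machines by hand.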
Storage
• Hyperconverged compute nodes
• Cinder with ScaleIO
  Scales with the number of disks, and so with the number of servers
• Object store
  Completely external (Atmos)
• Glance
  Uses an external S3 backend
  Caches images in the control plane
distributed network
services for SDN
Big picture
our journey
Active-Active HA support in OpenStack components
http://gorka.eguileor.com/simpler-road-to-cinder-active-active/
Bootstrapping clusters
• Monitor health
• Automate simple remediations
NO MAGICAL RECOVERY
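"Automate simple remediations" combined with "no magical recovery" suggests a bounded policy: retry the cheap fix a few times, then hand over to a human. The following is a minimal sketch of that idea only. The Service class, the restart limit and the rule that stateful clusters always go to an operator are assumptions made for illustration, not the presenters' tooling.

```python
from dataclasses import dataclass

MAX_RESTARTS = 2  # assumed limit for automatic restarts


@dataclass
class Service:
    """Hypothetical view of one monitored service instance."""
    name: str
    healthy: bool
    stateful: bool = False
    restarts: int = 0


def remediate(service):
    """Pick the next step for a service based on its health."""
    if service.healthy:
        return "ok"
    if service.stateful:
        # Assumption: clusters such as Galera or RabbitMQ are re-bootstrapped
        # by an operator, never repaired automatically.
        return "page-operator"
    if service.restarts < MAX_RESTARTS:
        service.restarts += 1
        return "restart"
    # The simple remediation did not help, so stop retrying.
    return "page-operator"


api = Service("nova-api", healthy=False)
steps = [remediate(api) for _ in range(3)]
db_step = remediate(Service("galera", healthy=False, stateful=True))
```

The point of the limit is that automation never loops forever on a problem it cannot fix.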
Benefits & drawbacks
Cloud-like architecture
• Control services can be treated as stateless applications
• Operation of the OpenStack control plane is similar to cloud workloads
• Dynamic and agile control plane for OpenStack
• Cost-effective solution (thanks to virtualization)
• OpenStack control plane does not depend on infrastructure
Cloud-like day 2 operations
• Measurable & scalable per component
• On-boarding new services -> deploy new roles
• Parallel deployment of the control plane for upgrades
• Back up only the stateful services, restage everything else
• Redeployment of nodes in case of failure / problems
Drawbacks
• Not fully A/A ready: Cinder-volume & Galera
• RabbitMQ/MariaDB don’t scale horizontally
• No magical recovery
• Network partitions & Keepalived
• Horizon needs sticky sessions -> round-robin DNS (RRDNS) does not work
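The Horizon drawback can be illustrated with a simplified model of the two request-distribution strategies. This is a sketch under assumptions: the backend names and the client address are made up, real round-robin DNS also depends on client-side caching, and hashing the source address is just one common way to get stickiness.

```python
import hashlib
from itertools import cycle

BACKENDS = ["horizon-1", "horizon-2", "horizon-3"]  # hypothetical node names


def round_robin(backends):
    """Yield backends in rotation, regardless of who the client is."""
    return cycle(backends)


def sticky_backend(client_ip, backends):
    """Pin a client to one backend by hashing its source address."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


# Three consecutive requests from the same client.
rr = round_robin(BACKENDS)
rr_choices = [next(rr) for _ in range(3)]
sticky_choices = [sticky_backend("203.0.113.7", BACKENDS) for _ in range(3)]
```

With rotation, the three requests land on three different nodes, so a session held on one node is not found on the next. With the hash, all three reach the same node.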
Future work
• OpenStack components
  Build services A/A from the beginning
  Built-in health endpoints in services (e.g. queried from HAProxy or monitoring)
• Deployment
  Packaging deployment as containers (Kolla?!)
• Architecture
  Decoupling storage from compute?
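The "built-in health endpoints" item could look roughly like the sketch below, written with the Python standard library only. Everything specific here is an assumption: the /healthz path, the check names and the JSON body are hypothetical, and current_checks is a placeholder where a real service would test its own database and message bus connections.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(checks):
    """Map named boolean checks to an HTTP status code and a JSON body."""
    failed = sorted(name for name, ok in checks.items() if not ok)
    status = 200 if not failed else 503
    body = json.dumps({"status": "ok" if not failed else "fail", "failed": failed})
    return status, body


def current_checks():
    # Placeholder: a real service would probe its dependencies here.
    return {"database": True, "messaging": True}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":  # hypothetical endpoint path
            self.send_error(404)
            return
        status, body = health_response(current_checks())
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the endpoint until interrupted (not called automatically)."""
    HTTPServer(("127.0.0.1", port), HealthHandler).serve_forever()
```

Calling serve() starts the endpoint. A load balancer or monitoring system can then poll it and take an instance out of rotation when it answers 503; HAProxy's HTTP check option is one way to do that.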
THANK YOU