OpenStack HA

OpenStack High Availability Jakub Pavlik


Page 1: OpenStack HA

OpenStack High Availability

Jakub Pavlik

Page 2: OpenStack HA

About me

Jakub Pavlík
• Cloud Platform Engineer
• 3 years in Cloud
• 2 years in OpenStack

Page 3: OpenStack HA

High Availability vs. Disaster Recovery

High Availability = fault detection & correction procedures to maximize availability of critical services and applications, often in an automated fashion.

Disaster Recovery = process of preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.

High Availability ≠ Disaster Recovery!

Page 4: OpenStack HA

Four types of HA in an OpenStack Cloud

Physical infrastructure
• Physical nodes
• Physical network
• Physical storage
• Hypervisor
• Host OS
• …

OpenStack Control services
• Compute Controller
• Network Controller
• Database
• Message Queue
• Storage
• …

VMs (OpenStack Compute)
• Virtual Machine
• Virtual Network
• Virtual Storage
• VM Mobility
• …

Applications
• Service Resiliency
• QoS
• Cost
• Transparency
• Data Integrity
• …

Page 5: OpenStack HA

Physical Infrastructure

Page 6: OpenStack HA

tcp cloud VPC Hardware

[Diagram: two identical groups of Controller 1, Controller 2, SAN 1, SAN 2, Passthru 1 and Passthru 2, plus Switch 1 and Switch 2]

• 168 cores at 3.46 GHz, 336 threads; 1:4 aggregation: 1344 vCPU; 2688 GB RAM; 28 x 10GE ports
• 168 cores at 2.67 GHz, 336 threads; 1:4 aggregation: 1344 vCPU; 1792 GB RAM; 28 x 10GE ports
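The vCPU figures on this slide follow from the thread count and the aggregation ratio. A minimal sketch of that arithmetic, assuming the 1:4 aggregation means four vCPUs per hardware thread (the function name is illustrative, not part of any OpenStack API):

```python
def vcpu_capacity(threads: int, vcpus_per_thread: int = 4) -> int:
    """Schedulable vCPUs for a given number of hardware threads.

    Assumption: the slide's 1:4 aggregation means 4 vCPUs per thread.
    """
    return threads * vcpus_per_thread


# 168 cores with two threads each give 336 threads per hardware set.
threads = 168 * 2
print(vcpu_capacity(threads))  # 1344, matching the slide
```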

Page 7: OpenStack HA

OpenStack Control services

Page 8: OpenStack HA

OpenStack modules – TCP VPC

Page 9: OpenStack HA

OpenStack High Availability Concepts

Stateless services
• There is no dependency between requests
• For example APIs: Nova, Keystone, Glance, Cinder, etc.

Stateful services
• An action typically comprises multiple requests
• For example: MySQL, RabbitMQ, etc.

Active/Passive
• Redundant instances of stateless services are load balanced
• For stateful services a replacement resource can be brought online

Active/Active
• Redundant instances of stateless services are load balanced
• Stateful services are managed in such a way that services are redundant, and that all instances have an identical state.
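The difference between the two modes can be sketched in a few lines. This is only an illustration under simple assumptions: node health is given as a plain dictionary, and the node names are hypothetical. Real deployments delegate this to a load balancer and a cluster manager.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Optional


def healthy(nodes: Dict[str, bool]) -> List[str]:
    """Names of nodes whose health flag is True, in insertion order."""
    return [name for name, ok in nodes.items() if ok]


def round_robin(nodes: Dict[str, bool]) -> Iterator[str]:
    """Active/active: every healthy instance takes requests in turn.

    The health snapshot is taken once, when the iterator is created.
    """
    return cycle(healthy(nodes))


def active_node(nodes: Dict[str, bool]) -> Optional[str]:
    """Active/passive: the first healthy node serves; the rest stand by."""
    alive = healthy(nodes)
    return alive[0] if alive else None


# Hypothetical API nodes: ctl3 is down and is skipped by the balancer.
api_nodes = {"ctl1": True, "ctl2": True, "ctl3": False}
rr = round_robin(api_nodes)
print([next(rr) for _ in range(4)])  # ['ctl1', 'ctl2', 'ctl1', 'ctl2']

# Hypothetical stateful pair: db1 failed, so the standby db2 takes over.
print(active_node({"db1": False, "db2": True}))  # db2
```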

Page 10: OpenStack HA

Corosync, Pacemaker and HAProxy

Corosync
• Totem single-ring ordering and membership protocol
• Provides UDP and InfiniBand based messaging, quorum, and cluster membership to Pacemaker

Pacemaker
• High availability and load balancing stack for the Linux platform
• Interacts with applications through Resource Agents (RA)

HAProxy
• Load balancing and proxying for HTTP and TCP applications
• Works over multiple connections
• Used to load balance API services
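Quorum, which Corosync supplies to Pacemaker, is commonly a strict majority of votes. The sketch below assumes one vote per node and no special options such as a two-node mode; the function names are illustrative and are not Corosync APIs.

```python
def quorum_size(total_nodes: int) -> int:
    """Smallest strict majority of the cluster (one vote per node assumed)."""
    return total_nodes // 2 + 1


def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """True if the reachable partition holds a strict majority."""
    return reachable_nodes >= quorum_size(total_nodes)


# A 3-node cluster keeps quorum after one failure...
print(has_quorum(2, 3))  # True
# ...while a plain 2-node cluster does not.
print(has_quorum(1, 2))  # False
```

Under this rule, three controllers is the smallest cluster that survives the loss of one node.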

Page 11: OpenStack HA

MySQL Galera

Synchronous multi-master cluster technology for MySQL/InnoDB

• MySQL patched for wsrep (Write Set REPlication)
• Active/active multi-master topology
• Read and write to any cluster node
• True parallel replication, on row level
• No slave lag or integrity issues
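Because any node accepts reads and writes, a load balancer health check usually needs to know whether a given node is part of the primary component and fully synced. A minimal sketch, assuming the caller has already collected the node's `wsrep_*` status variables (for example from `SHOW GLOBAL STATUS LIKE 'wsrep_%'`) into a dictionary; the function name is hypothetical:

```python
from typing import Mapping


def galera_node_usable(status: Mapping[str, str]) -> bool:
    """Decide whether a Galera node should receive traffic.

    `status` maps wsrep status variable names to their string values.
    How the values are fetched is left to the caller (assumption).
    """
    return (
        status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state_comment") == "Synced"
        and status.get("wsrep_ready") == "ON"
    )


synced = {
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
    "wsrep_ready": "ON",
}
joining = dict(synced, wsrep_local_state_comment="Joining")

print(galera_node_usable(synced))   # True
print(galera_node_usable(joining))  # False
```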

Page 12: OpenStack HA

Sample OpenStack HA architecture

Stateful
• Cinder Volume
• Neutron L3, DHCP agents
• Ceilometer central agent
• RabbitMQ

Stateless
• Neutron Server
• OpenStack APIs
• Apache web server
• Nova Scheduler
• Cinder Scheduler

[Diagram: Neutron agents (Active) and Neutron agents (Hot Standby)]

Page 13: OpenStack HA

VMs – Compute nodes

Page 14: OpenStack HA

VMs HA – two layers

Storage
• Shared storage filesystem – file disks (qcow2, vmdk, vhd)
• Block storage

Network
• Vanilla Neutron L3 agent (Open vSwitch, Linux Bridge)
• Vendor plugins - SDN controller

Page 15: OpenStack HA

No vSphere Style HA with KVM

Page 16: OpenStack HA

Non-Shared/Shared Storage filesystem

Shared Storage
• Live migration – only RAM is transferred
• Hypervisor Evacuation – the instance will be booted from the same disk and data will be preserved
• Ceph, Gluster, NFS, Samba, GFS

Non-Shared Storage
• Block Live Migration – disk and RAM are transferred
• Hypervisor Evacuation – the instance will be booted from a new disk, but will preserve the configuration, e.g. id, name, uuid
• Standard filesystem: EXT4, etc.
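The two cases on this slide reduce to a single decision on whether instance disks live on shared storage. A small sketch of that decision, with illustrative names that do not correspond to Nova's API:

```python
from enum import Enum


class Migration(Enum):
    LIVE = "live migration (RAM only)"
    BLOCK_LIVE = "block live migration (disk and RAM)"


def migration_mode(shared_storage: bool) -> Migration:
    """Shared storage needs only RAM copied; otherwise the disk moves too."""
    return Migration.LIVE if shared_storage else Migration.BLOCK_LIVE


def evacuation_result(shared_storage: bool) -> str:
    """What survives a hypervisor evacuation, per the slide."""
    if shared_storage:
        return "booted from the same disk; data preserved"
    return "booted from a new disk; id, name, uuid preserved"


for shared in (True, False):
    print(shared, migration_mode(shared).value, "|", evacuation_result(shared))
```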

Page 17: OpenStack HA

Block Storage - Cinder

• Instance boots from volume
• iSCSI/FC direct mapping to instance
• Enables live migration
• Cinder backends
  • LVM driver
    • Default Linux iSCSI server
  • Vendor software plugins
    • Gluster, Ceph, VMware VMDK driver
  • Vendor storage plugins
    • EMC VNX, IBM Storwize, SolidFire, etc.

Page 18: OpenStack HA

Networking - Vanilla Neutron L3 agent

Problems
• Routing on a Linux server (max. bandwidth approximately 3-4 Gbit/s)
• Limited distribution across multiple network nodes
• East-West and North-South communication passes through the network node

High Availability
• Pacemaker & Corosync
• Keepalived VRRP
• DVR + VRRP – should be in the Juno release
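With Keepalived VRRP, one network node holds the virtual router address and a backup takes over when it fails. The sketch below shows only the election rule from the VRRP specification (highest priority wins, ties go to the higher IP address). It is not Keepalived's implementation, and the router names and addresses are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Iterable, Optional


@dataclass(frozen=True)
class Router:
    name: str
    priority: int          # VRRP priority; higher wins
    address: IPv4Address   # primary address, used as the tie-breaker
    alive: bool = True


def elect_master(routers: Iterable[Router]) -> Optional[Router]:
    """Pick the VRRP master among the routers that are still alive."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.priority, r.address))


net1 = Router("net1", 150, IPv4Address("10.0.0.11"))
net2 = Router("net2", 100, IPv4Address("10.0.0.12"))
print(elect_master([net1, net2]).name)  # net1

# After net1 fails, the backup becomes master.
net1_down = Router("net1", 150, IPv4Address("10.0.0.11"), alive=False)
print(elect_master([net1_down, net2]).name)  # net2
```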

Page 19: OpenStack HA

Networking – Vendor SDN Controller plugins

Examples
• Juniper OpenContrail, VMware NSX, SDN PLUMgrid

Advantages over Neutron L3 agent
• North-South communication on network devices (iBGP, MPLSoverGRE)
• East-West communication directly between compute nodes
• Higher bandwidth (9.7 Gbit/s per 10 Gbit/s port)

High Availability
• iBGP peering into two routers
• Native HA implemented inside of network devices

Page 20: OpenStack HA

OpenStack HA – TCP VPC

[Architecture diagram. Labels shown: OpenStack Controller with MySQL and RabbitMQ (x3), Galera; Contrail Database with Zookeeper and Cassandra (x3); Contrail Config with Analytics & WebUI (x3); Contrail Control (x2); HAProxy (x3), VIP, Bond Interface; Pacemaker and Corosync (x2)]

Page 21: OpenStack HA

TCP Virtual Private Cloud

Page 22: OpenStack HA

HA methods - vendors

Vendor      Cluster/Replication Technique                    Characteristics
Rackspace   Keepalived, HAProxy, VRRP, DRBD                  Automatic - Chef
Red Hat     Pacemaker, Corosync, Galera                      Manual installation/Foreman
Cisco       Keepalived, HAProxy, Galera                      Manual installation, at least 3 controllers
tcp cloud   Pacemaker, Corosync, HAProxy, Galera, Contrail   Automatic SaltStack deployment
Mirantis    Pacemaker, Corosync, HAProxy, Galera             Automatic - Puppet

Page 23: OpenStack HA

Thank you for your attention!