Building High Availability in OpenStack Sergii Golovatiuk Mirantis

  • 7/25/2019 Building High Availability in OpenStack Sergii Golovatiuk Mirantis

    Building High Availability in OpenStack

    Created by Sergii Golovatiuk

    Logical architecture

    Core IaaS:
    - Nova (Compute)
    - Cinder (Block Storage)
    - Neutron (Networking)
    - Swift (Object Storage)
    - Glance (Image Mgmt)
    - Keystone (Identity)

    Cloud management:
    - FUEL (Provisioning and deployment)
    - Horizon (Self-service Web UI/Dashboard)
    - Ceilometer (Telemetry)

    PaaS elements:
    - Heat (Orchestration)
    - Murano (App catalogue)
    - Sahara (Data processing)

    Physical architecture

    FUEL master node
    - FUEL API

    Load Balancer (HAProxy)

    Controller nodes 1-3
    - OpenStack APIs on each node
    - OS: Linux (Ce
    - HA setup (HA

    Compute nodes 1-3
    - OS: Linux (Ce
    - Hypervisor: K

    Storage nodes 1-3
    - OS: Linux (Ce
    - Storage backe

    OpenStack High Availability St

    - HA Management - Corosync/Pacemaker
    - Networking HA
    - Database - MySQL
    - AMQP - RabbitMQ
    - API Services
    - Cache - Memcached
    - Storage - Ceph

    HA Management - Corosync/Pacemaker

    Yes, it's complex

    HA for Network Connectivity

    Link Aggregations:
    - Round Robin
    - Active-Passive
    - XOR
    - LACP (my favorite)
      - Requires switch configuration
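
To make the XOR mode more concrete, here is a minimal sketch of how a transmit hash can pin each source/destination pair to one physical link. It is a simplified illustration of the idea behind Linux bonding's layer-2 XOR policy, not the kernel's code: it hashes only the last byte of each MAC address, and the function name and sample addresses are hypothetical.

```python
def xor_select_slave(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Pick the index of the outgoing link for a frame.

    Simplified assumption: XOR the last byte of the source and
    destination MAC addresses, then take the result modulo the
    number of links in the bond.
    """
    if slave_count <= 0:
        raise ValueError("slave_count must be positive")
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % slave_count


# The same address pair always maps to the same link, so frames of one
# conversation are not reordered across links.
link = xor_select_slave("52:54:00:12:34:56", "52:54:00:ab:cd:ef", 2)
```

Because the choice depends only on the address pair, traffic between two hosts never uses more than one link; spreading load requires many distinct peers.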

    Database HA - MySQL / Galera

    - MySQL 5.6 with patches from Codership
    - xtrabackup from Percona
    - HAProxy + xinetd httpcheck
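
The HAProxy + xinetd check can be sketched as follows: xinetd runs a small script on each database node, and HAProxy treats the node as healthy only if that script answers with HTTP 200. This is an assumed, minimal version of such a script's decision logic, not the one shipped with Fuel. Reading `wsrep_local_state` from MySQL (for example via `SHOW STATUS LIKE 'wsrep_local_state'`) is left out, and the function name is hypothetical.

```python
SYNCED = 4  # wsrep_local_state value that Galera reports for a Synced node


def health_response(wsrep_local_state: int) -> bytes:
    """Build the raw HTTP response a health-check script would print.

    200 tells the load balancer the node may receive traffic;
    503 takes the node out of rotation.
    """
    if wsrep_local_state == SYNCED:
        status, body = "200 OK", "Galera node is synced.\r\n"
    else:
        status, body = "503 Service Unavailable", "Galera node is not synced.\r\n"
    headers = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n\r\n"
    )
    return (headers + body).encode("ascii")
```

Checking the replication state rather than just the TCP port keeps HAProxy from routing queries to a node that is up but still joining or acting as a donor.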

    MySQL/Galera - OCF Script

    - Use latest GTID info for PC (Primary Component) election
      - from CIB
      - from grastate.dat
    - Start PC with empty gcomm://
    - Clone based
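
The election idea can be sketched like this: each node's `grastate.dat` records the cluster UUID and the last committed sequence number, and the node with the highest sequence number is the safest one to bootstrap the Primary Component. This is an illustration only, not the OCF script itself. It assumes all nodes share the same cluster UUID, and the function names and node names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_grastate(text: str) -> Tuple[Optional[str], int]:
    """Extract (uuid, seqno) from the contents of a grastate.dat file."""
    uuid, seqno = None, -1
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "uuid":
            uuid = value
        elif key == "seqno":
            try:
                seqno = int(value)
            except ValueError:
                seqno = -1  # treat an unreadable seqno as unknown
    return uuid, seqno


def elect_bootstrap_node(states: Dict[str, str]) -> str:
    """Return the node whose saved state has the highest seqno.

    `states` maps node name -> grastate.dat contents. Ties are broken
    by node name so every node computes the same winner.
    """
    if not states:
        raise ValueError("no node states supplied")
    best_node, best_seqno = None, None
    for node in sorted(states):
        _, seqno = parse_grastate(states[node])
        if best_seqno is None or seqno > best_seqno:
            best_node, best_seqno = node, seqno
    return best_node
```

The elected node would then be started with an empty `gcomm://` address, and the remaining nodes join it.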

    Messaging HA - RabbitMQ

    - Hard to reassemble RabbitMQ cluster
    - Each node tries to connect to previous queue master

    AMQP - oslo.messaging

    - multiple rabbit connection included
    - oslo.messaging heartbeats
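
Client-side failover across several brokers can be sketched as below. This is not oslo.messaging's implementation; it only illustrates the general pattern of walking a list of brokers until one accepts the connection. The `connect` callable and host names are hypothetical placeholders.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(
    hosts: Sequence[str],
    connect: Callable[[str], T],
    max_rounds: int = 2,
) -> T:
    """Try each broker in order and return the first working connection.

    `connect` is a caller-supplied function that raises ConnectionError
    when a broker is unreachable. The host list is walked up to
    `max_rounds` times before giving up.
    """
    last_error = None
    for _ in range(max_rounds):
        for host in hosts:
            try:
                return connect(host)
            except ConnectionError as exc:
                last_error = exc  # remember the failure, try the next broker
    raise ConnectionError(f"no broker reachable among {list(hosts)}") from last_error
```

Heartbeats complement this pattern: they let the client notice a dead connection quickly instead of waiting on a TCP timeout before failing over.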

    RabbitMQ - OCF Script

    API Endpoint Load Balancing

    - HAProxy Based
    - Stateless (Active-Active)
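
Because the API services are stateless, any controller can serve any request, so the balancer only needs to rotate over the backends that currently pass their health checks. The sketch below illustrates that idea under simple assumptions: health is a static snapshot passed in by the caller, and the function and backend names are hypothetical. It does not reflect HAProxy internals.

```python
from itertools import cycle
from typing import Dict, Iterator


def round_robin(backends: Dict[str, bool]) -> Iterator[str]:
    """Return an endless rotation over the healthy backends.

    `backends` maps backend name -> whether its last health check
    passed. Unhealthy backends are skipped entirely.
    """
    healthy = [name for name, ok in backends.items() if ok]
    if not healthy:
        raise RuntimeError("no healthy backend available")
    return cycle(healthy)


# controller-2 is marked down, so requests alternate between the other two.
rotation = round_robin(
    {"controller-1": True, "controller-2": False, "controller-3": True}
)
first_four = [next(rotation) for _ in range(4)]
```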

    Cache HA - Memcached

    - Keystone stores tokens in memcached
    - Horizon keeps sessions in memcached

    Do we really need HA for memcached?

    Storage HA - Ceph

    - ephemeral storage = live migration
    - object and image storage shared
    - internal HA mechanism based on Paxos

    Main Deployment Concepts

    - Automate! Do not even try to do it manually
    - based on our pacemaker puppet reso
    - types/providers for corosync
    - error handling (retries)
    - patching based asymmetric cluster
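
The "error handling (retries)" point can be illustrated with a small retry helper. This is a generic sketch of retrying a deployment step with exponential backoff, not code from Fuel or its Puppet providers. The function name, parameters, and defaults are assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    attempts: int = 5,
    delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action` until it succeeds or `attempts` is exhausted.

    After each failure the helper waits `delay` seconds, then multiplies
    the delay by `backoff`. The last failure is re-raised unchanged.
    `sleep` is injectable so the helper can be tested without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff
```

Retries matter in cluster deployment because many steps depend on another service that is still starting, for example a resource that cannot be created until the cluster has formed quorum.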

    Testing High Availability

    - HA testing framework integrated with CI
    - destructive tests
    - performance testing during failure

    Results

    - Controller restart
    - network partitioning or port failure
    - DB node failure handling
    - AMQP node failure handling
    - API Endpoint service failure handling
    - Storage node failure handling

    Open Issues

    High Availability for Neutron L3 Agents: Virtual Routers (in Fuel 6.1)