White Paper

Using Production-Grade ADC Services to Build Scalable, Redundant OpenStack Clouds

Best practices for multi-zone and multi-region cloud integration.

This white paper offers a best-practice checklist for using OpenStack and Citrix NetScaler ADC services to build large-scale, high availability computing clouds.

  • White Paper Using Mirantis OpenStack and Citrix NetScaler 2

    citrix.com mirantis.com

Table of Contents

Executive Summary 3
Cluster Architecture Best Practices 4
  High Availability Architecture for the OpenStack Control Plane 4
  Use Host Aggregates to Abstract AZ Capabilities 6
  Networking Considerations for HA Environments 6
  Load Balancing Considerations 6
  Considerations in Deploying Stateful Applications 7
Production-Grade OpenStack LBaaS with Citrix NetScaler 7
  General considerations for performance and availability 8
  Deploying NetScaler ADC Services Across Multiple Availability Zones in OpenStack 10
  General Best Practices 10
  Deployment Considerations for Load Sharing Between Multiple NetScalers Across Multiple AZs and Datacenters 11
Mirantis/Citrix Integration in MOS 12
Conclusion 12
Next Steps 12
Resources 13


This white paper offers a best-practice checklist for using OpenStack and Citrix NetScaler ADC services to build large-scale, high availability computing clouds. Topics discussed include:

- Minimizing downtime and preventing data loss.
- Supporting a mix of cloud-aware/stateless and state-dependent applications.
- Hosting multiple tenants (OpenStack Projects) on the same shared infrastructure.
- Seamless, highly available application acceleration and load balancing services.
- Complying with performance SLAs.
- Extending cloud services to multiple locations.

    Our discussion will explore two main methodologies:

    Resource segregation (pooling): Appropriate use of OpenStack constructs such as availability zones and host aggregates to group infrastructure into fault domains and HA domains, simplifying and optimizing application deployment, maintenance, issue-handling and disaster recovery.

    Application Delivery Control (ADC): Best practices for using Citrix NetScaler technology to provide highly available, high performance, application delivery and load balancing services in a distributed, multi-tenant, fault-tolerant cloud architecture.

    Mirantis and Citrix have collaborated to cover this complex topic space, each bringing to bear deep engineering expertise and much real-world customer experience. Mirantis, the leading pure-play OpenStack company, creator of the highly-praised Mirantis OpenStack distribution, is currently the #3 contributor to OpenStack core*1 and has built more large-scale enterprise, organizational and service-provider OpenStack clouds than any other integrator.

    With NetScaler, Citrix pioneered ADC deployment in virtualized cloud architectures and has unique expertise in large-scale software-defined networking environments. This expertise is embodied in the NetScaler App Delivery Controller solution architecture, which answers requirements of organizations for service agility and automation, and assured application performance.


Cluster Architecture Best Practices

It's easier to build resilient and scalable OpenStack data centers if three best-practice rules are applied in planning:

1. Segregate physical resources to create fault domains, and plan to mark these as OpenStack Availability Zones. The principle here is to create stand-alone pools of control, compute, storage and networking capability that share no points of failure with neighboring pools (e.g., power, critical network connections), and to mark these as AZs.

2. Distribute OpenStack and database controllers across three or more adjacent fault domains to create a resilient cluster. See High Availability Architecture for the OpenStack Control Plane, below.

3. Design networks to meet both data plane and control plane criteria for scale out and high availability.

    Applying these recommendations creates a self-contained OpenStack cluster comprising a minimum of three fault domains, visible as Availability Zones to admins and users. This visibility, in turn, makes it easier to deploy apps in resilient ways:

    Fully stateless applications, e.g., web apps, can be made resilient by deploying redundant instances in separate AZs, behind load balancing (which can also be made resilient - see below). This ensures that failure of a single fault domain will not bring down the app.

    Apps requiring quorum-type HA can be deployed across three separate AZs with their synchronization scaffolding, just as the OpenStack and storage (Ceph) controllers are.

Thereafter, scale-out can be done by adding Compute, Storage, and Network capacity within each fault domain up to OpenStack controller capacity, and beyond that by adding infrastructure aggregations in their own fault domains, each with new OpenStack controller and storage controller components.
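To make the redundant-deployment pattern concrete, a stateless application instance can be pinned to an AZ at boot time with the Nova client. The image, flavor, AZ and instance names below are illustrative, not prescribed by this paper:

```shell
# Boot one instance of a stateless web tier into each availability zone;
# a load balancer in front then survives the loss of any single AZ.
$ nova boot --image webapp-image --flavor m1.small \
    --availability-zone az1 web-instance-1
$ nova boot --image webapp-image --flavor m1.small \
    --availability-zone az2 web-instance-2
```
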

High Availability Architecture for the OpenStack Control Plane

The Mirantis OpenStack Reference Architecture implements HA for the OpenStack control plane using the following set of approaches:

- Three (3) dedicated Controller Nodes are deployed.
- Stateless components of OpenStack on each Controller node (Nova, Cinder, Glance) are placed behind a load balancer (such as Citrix NetScaler), running in Active/Active mode.
- Stateful components of OpenStack (L3 gateway agent and DHCP agent) run in Active/Standby mode, with failover performed by Pacemaker in case of node failure.
- The database with cloud state data (MySQL) runs as a cluster using Galera:
  - Galera decision-making is based on a majority quorum algorithm, so in order to tolerate failure of a Controller, the number of nodes should be odd (3 to tolerate failure of 1 controller, 5 to tolerate failure of 2, etc.).
- The message queue (RabbitMQ) used by OpenStack services to communicate with each other runs in clustered fashion, using native RabbitMQ clustering capabilities.
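The quorum arithmetic behind the odd-node recommendation is simple: a majority quorum of N nodes survives floor((N-1)/2) failures, which is why a fourth node adds cost without adding failure tolerance over three. A quick sketch:

```shell
# A Galera majority quorum of N nodes tolerates floor((N-1)/2) failures,
# so even node counts add hardware without adding failure tolerance.
for N in 3 4 5; do
  echo "$N nodes tolerate $(( (N - 1) / 2 )) failure(s)"
done
```

Running this prints that 3 and 4 nodes alike tolerate only 1 failure, while 5 nodes tolerate 2.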


    Storage for VM boot disks, Images, Snapshots and Volumes is backed by a distributed storage platform of choice (Ceph, NetApp, EMC VNX, etc.) - This enables Live Migration and Evacuation of VMs (needs to be triggered from UI or API).

    If Ceph -- the resilient multimodal data storage system -- is used to support Cinder (volume storage), and/or Glance (image storage) and Swift (object storage API), it employs simple redundancy to protect its Object Storage Daemon components, and provides its own quorum-based HA for Ceph Monitor components, requiring time synchronization among all three (or more) nodes.

Mapping Fault Domains to OpenStack Availability Zones

Once OpenStack has been deployed, you can demarcate fault domains by assigning the resources in each domain to an OpenStack Availability Zone -- a user-visible name assigned to a Host Aggregate. Host Aggregates/Availability Zones are defined via the Nova client CLI or equivalent REST calls, as documented for OpenStack Juno. To summarize, you can create a host aggregate my_host_aggregate exposed as an availability zone my_availability_zone by issuing the CLI command:

    $ nova aggregate-create my_host_aggregate my_availability_zone

    Then list aggregates and their IDs via:

    $ nova aggregate-list

    And add hosts (e.g., my_hostname) to the availability zone by referencing the ID of the corresponding host aggregate:

$ nova aggregate-add-host <aggregate-id> my_hostname


Use Host Aggregates to Abstract AZ Capabilities

Unnamed Host Aggregates, visible only to administrators, may be used to further partition each availability zone by numerous criteria (e.g., hypervisor type), adding key-value pairs to the aggregate definition to achieve any desired level of granular description. The key-value pairs are used by nova-scheduler filters to further match workloads to available capacity and specific resource requirements (and to throw errors when attempts are made to deploy workloads on inappropriate hosts).

With care, it's also possible to define host aggregates that extend across all availability zones relevant to a particular workload family's redundancy requirements. The net result is to facilitate deploying stateless applications in a way that ensures resilience: the primary consideration is simply to make sure that application instances are deployed in separate availability zones, then trust the host-aggregate-driven filters to find hardware that matches application requirements.
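As a sketch of the key-value mechanism, metadata can be attached to an aggregate and then matched from a flavor's extra specs, so that nova-scheduler's aggregate filter steers instances onto suitable hosts. The aggregate name, flavor name, and the ssd key below are illustrative assumptions:

```shell
# Tag a host aggregate with an arbitrary capability (names are illustrative).
$ nova aggregate-set-metadata my_host_aggregate ssd=true

# Create a flavor and bind it to hosts carrying that capability, so the
# scheduler's AggregateInstanceExtraSpecsFilter matches only tagged hosts.
$ nova flavor-create ssd.medium auto 4096 40 2
$ nova flavor-key ssd.medium set aggregate_instance_extra_specs:ssd=true
```
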

Networking Considerations for HA Environments

Networking considerations play a critical role in determining the scale out and redundancy of an OpenStack cloud built on fault domains.

Both control and data plane availability and scale out models should be considered, not just for L2/L3 networking but also for network services such as ADC and firewall services. In deployments using traditional segmentation models such as VLAN-backed networks, the extent of the VLANs and the ability to stretch them across multiple physical domains need to be taken into account. As centralized, controller-based networking solutions are adopted for cloud architectures, an appropriate scalability and redundancy model for these control plane technologies should be carefully designed, in addition to addressing data plane scalability and performance.

Load Balancing Considerations

In addition to connectivity and isolation, the availability and redundancy model for network services (load balancers, firewalls, etc.) needs to be carefully considered when planning high availability for your OpenStack cloud. Primarily, the following factors need to be taken into account:

Scale out - A key aspect of designing for the cloud is the ability to add capacity on demand to meet the needs of growing workloads. Unlike typical enterprise deployments, where additional scale and performance is achieved by replacing existing devices with higher-performance devices, the recommended approach is to add more nodes as incremental units of capacity. With this scale out approach, your load balancing capacity can grow proportionally with the application's compute capacity. Vendor technologies should be evaluated for seamless scale out models on the control plane as well as the data plane while designing your cloud architecture.

High availability model (active-active vs. active-passive) - Most network-service vendors support one of these two modes of HA, and in some cases both. Although the active-passive HA model is very popular in standard enterprise deployments, active-active is the preferred model for typical cloud architectures. This is because cloud design is based on a scale out architecture in which, ideally, every node actively processes requests while providing N+1 redundancy for the other nodes.


Considerations in Deploying Stateful Applications

Another crucial aspect to consider during HA planning is the characteristics of the applications themselves. New applications designed for the cloud are typically stateless and built to fail, with the philosophy of replacing failed nodes with new ones rather than repairing those that have failed. While this loosely coupled architecture is well suited to the cloud model of capacity on demand, there may be legacy applications that still rely on shared state between multiple nodes, and this has to be taken into account. From an ADC standpoint, vendor technologies need to be evaluated for their support of synchronizing and sharing application state across multiple nodes in the scale out offerings of their ADC solutions.

Load balancing as a cloud service in OpenStack through Neutron LBaaS

Neutron is the Network-as-a-Service (NaaS) layer of OpenStack, and LBaaS (Load-Balancing-as-a-Service) is an advanced service of Neutron. Neutron LBaaS offers an as-a-service consumption model for load balancing through a set of consumption APIs for deploying load balancing policies, which are agnostic of specific vendor technologies and abstracted away from the infrastructure complexities involved in managing load balancing appliances. Neutron LBaaS allows commercial and open-source load balancing technologies to drive the actual load balancing of requests, giving the OpenStack operator flexibility of choice in backend technologies. While LBaaS is currently part of Neutron, plans are underway to make it a separate project within OpenStack, still under the Networking umbrella.
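For a concrete feel of the vendor-agnostic API, the Neutron LBaaS (v1-era) CLI workflow creates a pool, adds members, and exposes a VIP; whichever backend driver is configured (NetScaler or otherwise) does the actual balancing. The names, addresses, and the `<subnet-id>` placeholder below are illustrative:

```shell
# Create a load balancing pool on a tenant subnet.
$ neutron lb-pool-create --name web_pool --lb-method ROUND_ROBIN \
    --protocol HTTP --subnet-id <subnet-id>

# Register two backend web servers as pool members.
$ neutron lb-member-create --address 10.0.0.11 --protocol-port 80 web_pool
$ neutron lb-member-create --address 10.0.0.12 --protocol-port 80 web_pool

# Expose the pool to clients through a virtual IP.
$ neutron lb-vip-create --name web_vip --protocol HTTP --protocol-port 80 \
    --subnet-id <subnet-id> web_pool
```
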

Production-Grade OpenStack LBaaS with Citrix NetScaler

In many cases, multiple datacenters are distributed across different geographical regions, and a global traffic distribution and load balancing mechanism needs to be employed to optimize resource utilization across datacenters. All ADC vendors offer some form of global server load balancing (a DNS-based mechanism that distributes traffic globally) for this purpose. However, vendor offerings need to be evaluated for their sophistication on key metrics such as user proximity, geo-location, available capacity, weighting, etc.

NetScaler's OpenStack LBaaS integration has been designed as a production-grade solution for organizations running business-critical applications at scale. It addresses the operational concerns around running Infrastructure-as-a-Service (IaaS), while providing flexibility and control over performance, availability and scale.

NetScaler's OpenStack LBaaS solution is based on a purpose-built orchestration product from Citrix called NetScaler Control Center (NCC), which simplifies the operational complexity of deploying LBaaS in OpenStack by seamlessly integrating all NetScaler appliances, both physical and virtual.


Figure 2. Citrix NetScaler integrates with OpenStack via a driver for OpenStack's LBaaS plugin, providing seamless control of application-level load balancing through the Horizon web UI or the OpenStack Neutron client and REST APIs.

NetScaler Control Center (NCC) provides the following key benefits, enabling a cloud consumption model for value-added NetScaler ADC features and making it easy for cloud providers to offer any NetScaler ADC or security function as a cloud service:

- Capacity pooling across all NetScaler infrastructure.
- End-to-end automation across all NetScaler appliances.
- Guaranteed SLAs through service-aware resource allocation.
- Integration with OpenStack Keystone for single-sign-on authentication.
- Flexible placement algorithms for ADC policies.
- Centralized visibility and reporting for operational statistics.

General considerations for performance and availability

Choice of physical vs. virtual appliances - NetScaler provides a wide choice of platforms that can power Neutron LBaaS, ranging from physical and virtual to purpose-built multi-tenant appliances. Customers are free to select between these platforms purely on performance, scalability and cost, without having to make any changes to the LBaaS offering presented to their tenants. Customers that prefer the resiliency and reliability of purpose-built hardware may opt for physical appliances, whereas those that prefer a purely software-defined data center model would tend to adopt the virtual form factors.

Multi-tenancy isolation best practices - The use of shared infrastructure forms the underpinning of the economies of scale in a cloud offering. However, the question of how multiple tenants can be hosted on the same shared infrastructure without compromising isolation and security needs to be adequately addressed. NetScaler's Neutron LBaaS solution offers a wide choice of multi-tenancy isolation mechanisms for the provider to choose from:

- Fully dedicated instances for maximum isolation and independence - Designed for mission-critical workloads, this isolation model allows fine-grained hard walling of CPU, memory, throughput, SSL capacity and other critical resources for each tenant, and constitutes the highest form of isolation.


- Shared instances - Ideal for test/dev workloads, shared instances can host the ADC workloads of multiple tenants and offer a cost-effective, best-effort solution for multi-tenancy.

- Partition-based high-density multi-tenancy - Striking a fine balance between the isolation of dedicated instances and the capacity efficiency of shared instances, NetScaler's admin-partition-based multi-tenancy enables high density while allowing hard walling of certain critical parameters such as throughput, connections and memory.

Scalability and capacity on demand - NetScaler's industry-leading TriScale technology allows OpenStack providers to choose between various scalability options to increase capacity on demand:

- Scale up - NetScaler supports a pay-as-you-grow licensing model on all its appliances, where additional capacity can be unlocked on any device by simply applying a corresponding license.

- Scale out - NetScaler's TriScale clustering technology allows as many as 32 nodes to be clustered together into a single logical NetScaler unit, with seamless synchronization of operational and configuration data.

Multi-datacenter Architecture Best Practices

As noted previously, Mirantis uses native OpenStack segregation mechanisms to assemble multi-site clouds consisting of independent OpenStack installations:

- Regions, to enable shared UI and authentication for geo-dispersed datacenters.
- Availability Zones, to isolate fault domains within a datacenter (a rack, a power source, a server room).
- Host Aggregates, for grouping compute nodes by user-defined arbitrary characteristics (availability of directly attached SSD storage, server model, etc.).

    Figure 3. Multi-region cloud in conventional configuration, without ADC. Controllers within each region are in HA configuration. Compute elements are segregated in AZs circumscribing fault domains. Storage is highly available.


Deploying NetScaler ADC Services Across Multiple Availability Zones in OpenStack

General Best Practices

NetScaler offers multiple solutions for deploying ADC services across multiple availability zones in OpenStack. As these solutions are evaluated, the following best practices should be kept in mind:

The ADC services need to scale elastically along with the application and should follow the availability model of the application. Having a NetScaler instance (or a pair of instances) in each availability zone and spraying load between them provides fault isolation and protection not just for the compute resources, but also for the ADC services of an application.

Each fault domain is a self-contained unit designed to fail independently, and the same should be true of ADC services. In practice, this means typical best practice is for each NetScaler in an AZ to load balance traffic to the local compute farm within the same AZ, instead of load balancing to other AZs. This allows NetScaler instances in one fault domain to fail without impacting the other fault domains, and reduces the overhead of hair-pinning traffic from one AZ to another.

    NetScaler instances across AZs should ideally be loosely coupled with minimal interaction and state exchange between them, which is ideal for cloud design. Exceptions can be made for highly stateful applications that need state synchronization across AZs.

    Figure 4. Multi-region OpenStack cloud with NetScaler ADC installed in each region, in highly-available configuration.


Deployment Considerations for Load Sharing Between Multiple NetScalers Across Multiple AZs and Datacenters

As mentioned above, typical best practice entails deploying a NetScaler instance or HA pair per AZ, with that instance load balancing traffic to local compute resources within the same AZ. The obvious next question is how to distribute load across multiple AZs, on both the control and data planes.

Distributing the NetScaler data plane across multiple AZs: Distributing traffic on the data plane always involves an external entity (such as an upstream router or switch) that distributes the traffic.

- Using ECMP and RHI:

Equal Cost Multi Path (ECMP) is a common Layer 3 traffic distribution mechanism supported by all routers. A typical deployment has an upstream routing layer (such as a data center core router) use ECMP to distribute traffic to NetScalers across multiple availability zones. ECMP is based on a stateless hash mechanism that is flow-safe, and therefore ensures that traffic from the same flow is always processed by the same NetScaler.

    - Route Health Injection (RHI) is a mechanism that NetScaler supports to advertise the availability of services running on an instance through dynamic routing protocols like OSPF. In essence, RHI works by NetScaler injecting routes into OSPF for all the healthy services running on the instance. When a service becomes unhealthy or goes down, the route is automatically removed from the advertisements, and the upstream router will no longer direct traffic meant for that service to that NetScaler instance.

    ECMP and RHI together are a very popular choice for scale out architectures of NetScaler across multiple availability zones.
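The flow-safe property can be illustrated (this is a sketch, not the hash any particular router or NetScaler uses) by hashing a connection's 5-tuple and reducing it modulo the number of paths: the same tuple always maps to the same path index, so a flow's packets never change next hop. The 5-tuple values are hypothetical:

```shell
# Hash a hypothetical TCP flow's 5-tuple with a deterministic function
# (CRC-32 via cksum) and pick one of NUM_PATHS next hops; re-running on
# the same tuple always yields the same path.
NUM_PATHS=3
flow="tcp:10.0.0.5:51515->192.0.2.10:443"
hash=$(printf '%s' "$flow" | cksum | cut -d' ' -f1)
echo "path $(( hash % NUM_PATHS ))"
```
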

- Using GSLB for multi-DC deployments: For deployments that span multiple datacenters (and perhaps multiple geographical regions), NetScaler's Global Server Load Balancing (GSLB) solution is best suited for distributing traffic across data centers. Within each data center, which may contain multiple availability zones, ECMP + RHI remains an effective solution for balancing load across those availability zones.

    - TriScale clustering as a scale out solution for stateful applications: Some applications need persistence and user session state to be preserved and synced across nodes in different availability zones. For this type of application, TriScale clustering offers a fully stateful and operationally streamlined scale-out solution that guarantees even distribution of load across multiple AZs.

    Distributing NetScaler control plane across multiple AZs: In an OpenStack environment, NetScaler Control Center (NCC) constitutes the centralized control plane for NetScaler appliances. The scalability and availability model of NCC needs to be carefully thought through while designing your OpenStack cloud architecture. A general best practice is to have a one-to-one correspondence between an NCC instance and an OpenStack controller node.


    - Deploying NCC for HA within a region: Customers will be able to deploy multiple NCC instances in high availability configuration to form a single logical control plane, per region. This HA deployment of the control plane will have reachability to all NetScaler instances running across multiple AZs within a region, similar to the way OpenStack controller HA cluster manages resources across multiple AZs.

- Multi-region considerations: Typical best practice for multi-region architectures is to have completely separate OpenStack deployments in each region or geographically dispersed data center. For NetScaler ADC services, this corresponds to completely separate NCC deployments managing and controlling the NetScaler appliances within each region.

Mirantis/Citrix Integration in MOS

Deploying a highly available OpenStack cluster (with optional Ceph support) is simplified by Mirantis OpenStack, whose wizard-driven Fuel installer can auto-deploy OpenStack with HA and/or Ceph through simple, wizard-based configuration. For PoCs or testing, Fuel can deploy HA and Ceph components on a single Controller node (HA is not fully functional in this case), then deploy additional Controllers on the cluster (HA becomes active when three or more Controllers are deployed). For more information about Mirantis OpenStack and Fuel, see the Mirantis OpenStack Planning Guide and User Guide.

    NetScaler ADC (all platforms, physical and virtual) is certified by Mirantis to interoperate with Mirantis OpenStack. A runbook for integrating NetScaler with MOS has been produced and verified by Mirantis engineers. Mirantis provides L1 and L2 support to Mirantis OpenStack users with NetScaler, and will escalate L3 issues to Citrix engineers for support.

Conclusion

Citrix NetScaler ADC brings significant benefits to OpenStack cloud operations, administration, performance and resilience, particularly as clouds grow larger, extend to multiple regions, and face increasingly diverse multi-tenant demands in each cloud data center. As we hope this white paper demonstrates, NetScaler's overall architecture and HA strategies dovetail well with proven OpenStack best practice for building resilient scale-out clouds.

Next Steps

Readers interested in learning more about Citrix NetScaler ADC solutions and Mirantis OpenStack are encouraged to begin by visiting the Citrix partner page on mirantis.com. The latest Mirantis OpenStack distribution can be downloaded free of charge at http://software.mirantis.com, and comes with a complimentary 30 days of support -- ideal for evaluation and PoC implementations.

Mirantis is happy to discuss your plans to evaluate or implement OpenStack clouds with Citrix NetScaler ADC, and can support your efforts with a range of engineering services, including Architectural Design Assessments that quickly put the knowledge and expertise of Mirantis cloud architects at your disposal for concrete input and direction. To schedule an Architecture Design Assessment (ADA), please contact us at https://online.mirantis.com/contact-us.


Corporate Headquarters: Fort Lauderdale, FL, USA
Silicon Valley Headquarters: Santa Clara, CA, USA
EMEA Headquarters: Schaffhausen, Switzerland
India Development Center: Bangalore, India
Online Division Headquarters: Santa Barbara, CA, USA
Pacific Headquarters: Hong Kong, China
Latin America Headquarters: Coral Gables, FL, USA
UK Development Center: Chalfont, United Kingdom

About Citrix

Citrix (NASDAQ:CTXS) is leading the transition to software-defining the workplace, uniting virtualization, mobility management, networking and SaaS solutions to enable new ways for businesses and people to work better. Citrix solutions power business mobility through secure, mobile workspaces that provide people with instant access to apps, desktops, data and communications on any device, over any network and cloud. With annual revenue in 2014 of $3.14 billion, Citrix solutions are in use at more than 330,000 organizations and by over 100 million users globally. Learn more at www.citrix.com.

    Copyright 2015 Citrix Systems, Inc. All rights reserved. Citrix, NetScaler, NetScaler App Delivery Controller and TriScale are trademarks of Citrix Systems, Inc. and/or one of its subsidiaries, and may be registered in the U.S. and other countries. Other product and company names mentioned herein may be trademarks of their respective companies.

About Mirantis

Mirantis is the leading pure-play OpenStack company, creator of the highly-praised Mirantis OpenStack distribution. Mirantis is currently the #3 contributor to OpenStack core and has built more large-scale enterprise and service-provider OpenStack clouds than any other entity. Mirantis OpenStack incorporates a sophisticated pre-configuration and deployment tool (Fuel) that substantially automates creation of robust OpenStack clouds in High Availability (HA) configurations. Mirantis is a founding member of the OpenStack Foundation.

© 2015 Mirantis. Mirantis and the Mirantis logo are registered trademarks of Mirantis in the U.S. and other countries. Third party trademarks mentioned are the property of their respective owners.


Resources

- Mirantis OpenStack Documentation (6.0)
- Mirantis Reference Architectures (for HA, Neutron-network, Ceph)
- Mirantis Bill of Materials Calculator
- OpenStack community documentation
- OpenStack Architecture Design Guide
- OpenStack Scaling