Multi-site OSP - OSS Boston · MULTI-SITE OPENSTACK DEPLOYMENT OPTIONS & CHALLENGES FOR TELCOS...

MULTI-SITE OPENSTACK DEPLOYMENT OPTIONS & CHALLENGES FOR TELCOS Azhar Sayeed Chief Architect [email protected]


MULTI-SITE OPENSTACK DEPLOYMENT OPTIONS & CHALLENGES

FOR TELCOS

Azhar Sayeed Chief Architect

[email protected]

DISCLAIMER Important Information

The information described in this slide set does not provide any commitments to roadmaps or availability of products or features. Its intention is purely to provide clarity in describing the problem and drive a discussion that can then be used to drive open source communities. Red Hat Product Management owns the roadmap and supportability conversation for any Red Hat product.

AGENDA

•  Background: OpenStack Architecture
•  Telco Deployment Use case
•  Distributed deployment – requirements
•  Multi-Site Architecture
•  Challenges
•  Solution and Further Study
•  Conclusions


OPENSTACK ARCHITECTURE

WHY MULTI-SITE FOR TELCO?

•  Compute requirements – not just at the Data Center
•  Multiple Data Centers
•  Managed Service Offering
•  Managed Branch Office
•  Thick vCPE
•  Mobile Edge Compute
•  vRAN – vBBU locations
•  Virtualized Central Offices
•  Hundreds to thousands of locations
•  Primary and Backup Data Center – Disaster recovery
•  IoT Gateways – Fog computing

Centrally managed compute closer to the user


Multiple DC or Central Offices

Security & Firewall Quality of Service (QoS) Traffic Shaping Device Management

Main Data Center

Overlay Tunnel over Internet

E2E Orchestrator

Remote Sites
•  Hierarchical connectivity model of CO
•  Remote sites with compute requirements
•  Extend OpenStack to these sites

Independent OpenStack Deployments

Backup Data Center

Remote Data Centers

A typical service almost always spans across multiple DCs


Multiple DCs – NFV Deployment

L2 or L3 Extensions between DCs

Real Customer Requirements

Fully Redundant System

Controllers Storage Nodes Compute Nodes

Region1

Region2 ... 25

•  25 sites
•  2-5 VNFs required at each site
•  Maximum of 2 compute nodes per site needed for these VNFs
•  Storage requirements = image storage only
•  Total number of control nodes = 25 * 3 = 75
•  Total number of storage nodes = 25 * 3 = 75
•  Total number of compute nodes = 25 * 2 = 50

Redundant configuration overhead: 75%
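
The 75% figure follows from the node counts on this slide: 150 of the 200 nodes are controllers or storage nodes rather than compute. A minimal sketch of that arithmetic is shown here; the function name and its parameters are hypothetical, and only the per-site counts come from the slide.

```python
def redundancy_overhead(sites, controllers_per_site, storage_per_site, compute_per_site):
    """Return node totals and the share of nodes that do not run workloads."""
    control = sites * controllers_per_site
    storage = sites * storage_per_site
    compute = sites * compute_per_site
    total = control + storage + compute
    # Overhead = nodes that exist only to make each site fully redundant.
    overhead = (control + storage) / total
    return {
        "control": control,
        "storage": storage,
        "compute": compute,
        "total": total,
        "overhead": overhead,
    }


# Per-site counts as quoted on the slide: 3 controllers, 3 storage, 2 compute.
result = redundancy_overhead(sites=25, controllers_per_site=3,
                             storage_per_site=3, compute_per_site=2)
print(f"{result['total']} nodes, {result['overhead']:.0%} overhead")
```

Changing `sites` to 1000 keeps the ratio at 75%, which is why the fully redundant island model becomes expensive at Central Office scale.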


Virtual Central Office

L2 or L3 Extensions between DCs

Real Customer Challenge

Fully Redundant System

Controllers Storage Nodes Compute Nodes

Region1  Region2 ... 1000+

•  1000+ sites – Central Offices
•  From a few 10s to 100s of VMs
•  Fully redundant configurations
•  Termination of Residential, Business and Mobile Services
•  Managing 1000 OpenStack islands
•  Tier 1 telcos already have >100 sites today

Management Challenge

DEPLOYMENT OPTIONS

OPTIONS

•  Multiple Independent Island Model – seen this already
•  Common Authentication and Management
–  External user policy management with LDAP integration
–  Common Keystone
•  Stretched deployment model
–  Extend compute and storage nodes into other Data Centers
–  Keep central control of all remote resources
•  Allow Data Centers to share workloads – Tricircle approach
•  Proxy the APIs – Master/Slave model or cascading model
•  Agent-based model
•  Something else??


Multiple DC or Central Offices

L2 or L3 Extensions between DCs

Feed the load balancer
•  Site capacity independent of the other
•  User information separate or replicated offline
•  Load balancer directs traffic where to go – good for load sharing
•  DR – external problem

Independent OpenStack Deployments

LB

Fully Redundant System

Fully Redundant System

Controllers Storage Nodes Compute Nodes

Cloud Management Platform

Region1

Region2…N

Good for a few 10s of sites – what about 100s or thousands of sites?

Directory


Extended OpenStack Model

L2 or L3 Extensions between DCs

Common or Shared Keystone
•  Single Keystone for authentication
•  User information in one location
•  Independent resources
•  Modify the Keystone endpoint table
•  Endpoint, Service, Region, IP
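
As a rough illustration of the endpoint table idea, the sketch below models a single shared catalog that maps a (service, region) pair to an address. It is not Keystone's API: the `Endpoint` class, `CATALOG`, `find_endpoint`, and all addresses are hypothetical and use documentation-only IP ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    service: str  # e.g. "compute" or "image"
    region: str   # e.g. "Region1"
    url: str      # address of that service in that region


# Hypothetical catalog held by the one shared Keystone.
CATALOG = [
    Endpoint("compute", "Region1", "http://192.0.2.10:8774/v2.1"),
    Endpoint("image", "Region1", "http://192.0.2.10:9292"),
    Endpoint("compute", "Region2", "http://198.51.100.10:8774/v2.1"),
    Endpoint("image", "Region2", "http://198.51.100.10:9292"),
]


def find_endpoint(catalog, service, region):
    """Return the URL registered for a service in a given region."""
    for ep in catalog:
        if ep.service == service and ep.region == region:
            return ep.url
    raise LookupError(f"no {service} endpoint registered for {region}")


print(find_endpoint(CATALOG, "compute", "Region2"))
```

Users authenticate once, then each region's resources stay independent because only the catalog entry, not the service itself, is shared.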

Shared Keystone Deployment

Fully Redundant System

Fully Redundant System

Controllers Storage Nodes Compute Nodes

Cloud Management Platform

Region1

Region2…N

Keystone

Identity: Keystone – single point of control

Directory


Extended OpenStack Model

L2 or L3 Extensions between DCs

Central Controller
•  Single authentication
•  Distributed compute resources
•  Single Availability Zone per Region

Central Controller and Remote Compute & Storage (HCI) Nodes

Fully Redundant System

Controllers Storage Nodes Compute Nodes

Cloud Management Platform

Region1 Region2…N

Replicated Storage – Galera Cluster

Cinder, Glance and Image

Manual Restore

Directory


Revisiting the Branch Office - Thick CPE

Enterprise vCPE x86 Server with VNFs

Data Center

Internet

Enterprise vCPE

NFVI

Security & Firewall Quality of Service (QoS) Traffic Shaping Device Management

OpenStack, OpenShift/Kubernetes

Can we deploy compute nodes at all the branch sites and centrally control them?

IPSec, MPLS or Other Tunnel mechanism

E2E Network Orchestrator

Deploy Nova Compute

How do I scale it to thousands of sites?

OSP 10 – Scale components independently

Most OpenStack HA services and VIPs must be launched/managed by Pacemaker or HAProxy. However, some can be managed via systemctl, thanks to the simplification of Pacemaker constraints introduced in versions 9 and 10.


COMPOSABLE SERVICES AND CUSTOM ROLES

•  Leverage composable services model
–  to define a Central Keystone
–  Place functionality where it is needed – i.e. disaggregate
•  Deployable standalone on separate nodes or combined with other services into Custom Role(s)
–  Distribute the functionality depending on the DC locations
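
The disaggregation idea can be sketched as a mapping from roles to the services they host. This is only an illustration, not the TripleO role format: the role names and the split of services between the custom roles are assumptions loosely based on the labels in the slide's diagram.

```python
# One hardcoded role hosting every control service.
MONOLITHIC = {
    "Controller": ["Keystone", "Ceilometer", "Neutron", "RabbitMQ", "Glance"],
}

# The same services spread across custom roles (assumed split).
CUSTOM = {
    "CustomController": ["Keystone", "RabbitMQ", "Glance"],
    "CustomCeilometer": ["Ceilometer"],
    "CustomNetworker": ["Neutron"],
}


def placement(roles):
    """Map each service to the role that hosts it; reject duplicate placement."""
    where = {}
    for role, services in roles.items():
        for service in services:
            if service in where:
                raise ValueError(f"{service} placed in both {where[service]} and {role}")
            where[service] = role
    return where


def same_services(a, b):
    """True when two layouts deploy exactly the same set of services."""
    return set(placement(a)) == set(placement(b))


print(placement(CUSTOM)["Neutron"])
print(same_services(MONOLITHIC, CUSTOM))
```

A check like `same_services` is one way to confirm that splitting a role moved functionality without dropping any of it.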

Hardcoded Controller Role

Custom Controller Role

Custom Ceilometer Role

Custom Networker Role

...

Keystone

Ceilometer

Neutron

RabbitMQ

Glance

Keystone

Ceilometer

Neutron

RabbitMQ

Glance

...


Re-visiting the Virtual Central Office use case

L2 or L3 Extensions between DCs

Real Customer Challenge

Fully Redundant System

Controllers Storage Nodes Compute Nodes

Region1

Require flexibility and some hierarchy

Region2

Region3

Region4

Region3b  Region3a


Scaling across a thousand sites?

CONSIDERATIONS

•  Some areas that we need to look at
•  Latency and Outage times
•  Delays due to distance between DCs and link speeds – RTT
•  The remote site is lost – headless operations and subsequent recovery
•  Startup Storms
•  Scaling Oslo messaging
•  RabbitMQ
•  Scaling of Nodes => Scale RabbitMQ/Messaging
•  Ceilometer (Gnocchi & Aodh) – heavy user of MQ


Scaling across a thousand sites?

LATENCY AND OUTAGE TIMES

•  Latency between sites – Nova API calls
•  10, 50, 100 ms? Round trip time = queue tuning
•  Bottleneck link/node speed
•  Outage time – recovery time
•  30 s or more?
•  Nova Compute services flapping
•  Confirmation – from provisioning to operation
•  Neutron timeouts – binding issues
•  Headless operation
•  Restart – causes storms
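
To make the effect of round trip time concrete, the sketch below adds up the wait for a chain of calls that must complete one after another. The number of round trips and the timeout are purely hypothetical values chosen for illustration; only the 10, 50, and 100 ms RTTs come from the slide.

```python
def sequential_call_delay_s(rtt_ms, round_trips):
    """Total wait, in seconds, for calls that complete one after another."""
    return rtt_ms * round_trips / 1000.0


ASSUMED_ROUND_TRIPS = 50  # hypothetical: control-plane exchanges per operation
ASSUMED_TIMEOUT_S = 3.0   # hypothetical: budget before a caller gives up

for rtt_ms in (10, 50, 100):
    delay = sequential_call_delay_s(rtt_ms, ASSUMED_ROUND_TRIPS)
    status = "ok" if delay <= ASSUMED_TIMEOUT_S else "exceeds timeout"
    print(f"RTT {rtt_ms:>3} ms -> {delay:.1f} s ({status})")
```

Under these assumed numbers the operation fits the budget at 10 and 50 ms but not at 100 ms, which is the kind of threshold behind timeouts and binding failures on long links.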

RABBITMQ TUNING

•  Tune the buffers – increase buffer size
•  Take into account messages in flight – rates and round trip times
•  BDP = Bottleneck speed * RTT
•  Number of messages
•  Servers * backends * requests/sec = Number of messages/sec
•  Split into multiple instances of message queues for distributed deployment
•  Ceilometer into a MQ – heaviest user of MQ
•  Nova into a single MQ
•  Neutron into a MQ
•  Refer to an interesting presentation on this topic – “Tuning RabbitMQ at Large Scale Cloud” – OpenStack Summit – Austin 2016
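
The two sizing formulas on this slide can be worked through as below. This is a sketch with assumed inputs: the link speed, RTT, and server/backend/request counts are hypothetical, and the function names are introduced here only for illustration.

```python
def bdp_bytes(bottleneck_bits_per_s, rtt_ms):
    """Bandwidth-delay product: bytes in flight on the slowest link."""
    # bits/s * ms / 1000 gives bits; divide by 8 for bytes.
    return bottleneck_bits_per_s * rtt_ms // 8000


def messages_per_second(servers, backends, requests_per_s):
    """Servers * backends * requests/sec, as stated on the slide."""
    return servers * backends * requests_per_s


# Assumed example: a 100 Mbit/s bottleneck link with a 50 ms round trip.
print(bdp_bytes(100_000_000, 50), "bytes in flight")

# Assumed example: 1000 servers, 3 backends each, 10 requests/sec.
print(messages_per_second(1000, 3, 10), "messages/sec")
```

Buffers smaller than the bandwidth-delay product leave the link idle while waiting for acknowledgements, which is the reason for increasing buffer size on long-RTT paths.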

MQ

MQ

MQ

Nova Conductor

Compute

Ceilometer collector

Ceilometer Agents

Neutron
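
The diagram's split into one message queue per service can be sketched as a simple mapping. The broker host names and credentials are hypothetical; the `rabbit://` URL shape follows the style of oslo.messaging's `transport_url` option, but nothing here calls oslo.messaging itself.

```python
from urllib.parse import urlsplit

# Hypothetical: each service gets its own broker instance.
MQ_FOR_SERVICE = {
    "nova": "rabbit://nova:secret@mq-nova.example.com:5672/",
    "neutron": "rabbit://neutron:secret@mq-neutron.example.com:5672/",
    "ceilometer": "rabbit://ceilometer:secret@mq-telemetry.example.com:5672/",
}


def transport_url(service):
    """Return the broker URL a service should be configured with."""
    return MQ_FOR_SERVICE[service]


def brokers_in_use(mapping):
    """Distinct broker hosts, to confirm the load is actually partitioned."""
    return {urlsplit(url).hostname for url in mapping.values()}


print(transport_url("ceilometer"))
print(sorted(brokers_in_use(MQ_FOR_SERVICE)))
```

Keeping Ceilometer on its own instance isolates the heaviest MQ user from Nova and Neutron traffic.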

RECENT AMQP ENHANCEMENTS

•  Eliminates the broker based model
•  Enhances AMQP 1.0
•  Separate messaging end point from message routers
•  Newton has AMQP driver for oslo messaging
•  Ocata provides perf tuning, upstream support for TripleO
•  If you must use RabbitMQ
•  Use clustering and exchange configurations
•  Use shovel plugin with exchange configurations and multiple instances

Broker

Broker

Broker

Broker Broker

Hierarchical-Tree

Mesh-Routed


OPENSTACK CASCADING PROJECT


Parent

Child

Child

Child Child

Parent  AZ1  AZn

Proxy for Nova, Cinder, Ceilometer & Neutron subsystems per site
At Parent – loads of proxies, one set per Child
User communicates to the master


Cascading solution split into two projects

TRICIRCLE AND TRIO2O

•  Tricircle – Networking across OpenStack clouds
•  Trio2o – Single API Gateway for Nova, Cinder

Expand workloads into other OS instances
Create networking extensions
Isolation of East-West traffic
Application HA

API Gateway

User1 UserN

AZ1 AZx AZn

TRICIRCLE
Make Neutron(s) work as a single cluster

Trio2o

OPNFV Multi-Site Project – Euphrates release

Single Region with multiple sub-regions
Shared or Federated Keystone
Shared or Distributed Glance
UID = TenantID + PODID
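
A toy model of the gateway idea: a single entry point that picks a backend from the tenant and pod identifiers. This is not Trio2o's implementation; treating "TenantID + PODID" as a string joined with ":" is an assumption, and every name and URL below is hypothetical.

```python
def make_uid(tenant_id, pod_id):
    """Combine tenant and pod identifiers (joining with ':' is an assumption)."""
    return f"{tenant_id}:{pod_id}"


# Hypothetical routing table kept by the single API gateway.
ROUTES = {
    make_uid("tenant-a", "pod-1"): "https://az1.example.com",
    make_uid("tenant-a", "pod-2"): "https://azn.example.com",
    make_uid("tenant-b", "pod-1"): "https://az1.example.com",
}


def route(tenant_id, pod_id):
    """Return the backend OpenStack instance that should serve this request."""
    uid = make_uid(tenant_id, pod_id)
    if uid not in ROUTES:
        raise LookupError(f"no pod binding for {uid}")
    return ROUTES[uid]


print(route("tenant-a", "pod-2"))
```

The user only ever talks to the gateway; which availability zone serves the call is decided by the binding.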

pod


Remote Compute Nodes


WHAT’S THE ALTERNATIVE?

•  Should we abandon the idea of Remote Nova Nodes? •  Use Packstack/AllinOne – OSP in a box – ala Vz uCPE

•  High overhead if you want to run 1-2 VNFs

•  Perhaps some optimization possible using Kolla/Container model

•  Initialize the remote nodes – Need L3/L2 connectivity for PXE •  Make that a Kubernetes Node – Use containers on that node

•  Implement a new interface for remote nodes

•  Nova Agent on remote nodes ?

•  Abandon the idea of OpenStack – No!!!! No OpenStack really!!! ?

•  Use a CMP – to manage remote bare metal nodes

•  KVM – Hypervisor

•  Run Containers on remote nodes – Do we run into same issues?


Virtual controllers – to get around node restrictions

VIRTUAL CONTROLLER MODEL

Kolla – Containerizing the control plane
•  Kolla-Kubernetes and Kolla-Ansible
•  Containerizing OSP control makes the previous options easier
•  Can remote nodes be considered as Pods in Kubernetes environments?
•  Interface between Master and Host node
•  The containers can be deployed on those nodes to manage apps or even OSP services

Keystone Glance Nova

Neutron VM1 VM2

SUMMARY

•  Deploying OpenStack at multiple sites is a must for Telcos
•  Tricircle and Trio2o offer good promise
•  Tune RabbitMQ or move to MQ enhancements (AMQP)
•  Partition MQ
•  Scale MQ instances
•  Carefully craft the Availability Zone model
•  Nova Agent Proxy
•  Deploying bare metal at remote sites is still an issue; it does not solve the problem of access
•  Another way of automation using call home
•  Use Kubernetes as master orchestrator => Kubernetes managing OSP managing container workloads – K8S Sandwich



THANK YOU

ABSTRACT Important Information

OpenStack provides a great Infrastructure-as-a-Service (IaaS) platform for deployment of applications in virtual machines and containers. For telcos specifically, OpenStack unifies the point of presence (PoP), central office, and datacenter infrastructure. However, many telcos need OpenStack deployed in many data centers around the region or country. The question is how should they deploy OpenStack for multi-site needs? Should they consider stretched deployment where different components sit in different locations? Or should they consider replicating the entire OpenStack environment in each location? What impact does this have for Keystone, messaging, disaster recovery, and more importantly, unified management of all these sites? This presentation will discuss architectural and deployment options for multi-site deployments of OpenStack.