Kubernetes and OpenStack at Scale

KUBERNETES AND OPENSTACK AT SCALE Will it blend? Stephen Gordon (@xsgordon) Principal Product Manager, Red Hat May 8th, 2017

Transcript of Kubernetes and OpenStack at Scale

Page 1: Kubernetes and OpenStack at Scale

KUBERNETES AND OPENSTACK AT SCALE

Will it blend?

Stephen Gordon (@xsgordon), Principal Product Manager, Red Hat

May 8th, 2017

Page 2: Kubernetes and OpenStack at Scale

KUBERNETES AND OPENSTACK AT SCALE #OPENSTACKSUMMIT #REDHAT

ONCE UPON A TIME...

Part 1

● 1000 OpenShift Container Platform 3.3 / Kubernetes 1.3 nodes on OpenStack infrastructure

● Presented methodology and results in Barcelona:

○ https://www.cncf.io/blog/2016/08/23/deploying-1000-nodes-of-openshift-on-the-cncf-cluster-part-1/

● Goals were:

○ Push limits

○ Identify best practices

○ Document best practices

○ Fix issues

Page 3: Kubernetes and OpenStack at Scale


FOR OUR NEXT TRICK!

Part 2

● Goals:

○ 2048 OpenShift Container Platform 3.5 / Kubernetes 1.5 nodes on OpenStack infrastructure

○ Network ingress tier saturation test

○ Overlay2 graph driver w/ SELinux test

○ Persistent volume scalability and performance test of Container Native Storage (glusterfs)

Page 4: Kubernetes and OpenStack at Scale


KUBERNETES SCALABILITY SIG

Scalability SIG SLAs:

● API responsiveness

○ 99% of calls return in < 1 s

● Pod startup time

○ 99% of pods start within 5 s*

The SIG also defines a number of other primary and derived metrics.

* With pre-pulled images
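The two SLAs above are percentile checks, so they can be evaluated mechanically from a list of measured latencies. The following is a minimal sketch, assuming a nearest-rank percentile and latencies in seconds; the function names and the sample values are hypothetical and are not taken from the SIG's tooling.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n, 1-based
    return ordered[max(rank, 1) - 1]


def api_sla_met(call_latencies_s):
    """99% of API calls return in under 1 second."""
    return percentile(call_latencies_s, 99) < 1.0


def pod_startup_sla_met(startup_times_s):
    """99% of pods (with pre-pulled images) start within 5 seconds."""
    return percentile(startup_times_s, 99) <= 5.0


# Illustrative input only: 99 fast calls and one slow outlier still meet the SLA.
example_calls = [0.1] * 99 + [2.0]
api_ok = api_sla_met(example_calls)
```

With 100 samples the 99th percentile is the 99th smallest value, so a single outlier is tolerated but two are not.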

Page 5: Kubernetes and OpenStack at Scale


A CONTAINER STACK FOR OPENSTACK

A wild solution appears...

OPENSTACK + KUBERNETES

Consumption of resources

Able to easily access new environments to quickly build new apps and move on

Exposition of resources

Provide necessary environments to developers in minutes, not weeks or months

Page 6: Kubernetes and OpenStack at Scale


A CONTAINER STACK FOR OPENSTACK

A wild solution appears...

OPENSTACK + OPENSHIFT

Consumption of resources

Integrated platform to run, orchestrate, monitor, and scale containers. Built around Kubernetes and Docker.

Exposition of resources

Provide necessary environments to developers in minutes, not weeks or months

Page 7: Kubernetes and OpenStack at Scale


CONCEPTUAL ARCHITECTURE

Architectural tenets:

● Technical independence

● Contextual awareness

● Avoiding redundancy

● Simplified management

Reference architecture:

red.ht/2ibNmvX

Page 8: Kubernetes and OpenStack at Scale

PREPARATION

Page 9: Kubernetes and OpenStack at Scale


WHERE TO TEST?

Page 10: Kubernetes and OpenStack at Scale


HOW TO TEST?

System Verification Test suite (SVT)

● Red Hat OpenShift Performance and Scalability team’s upstream test suites:

○ Application Performance

○ Application Scalability

○ OpenShift Performance

○ OpenShift Scalability (incl. cluster-loader)

○ Networking Performance

○ Reliability/Longevity

● Also includes some additional tools, e.g. the image provisioner

● https://github.com/openshift/svt

Page 11: Kubernetes and OpenStack at Scale


ARCHITECTURE

Baremetal Cluster (100 nodes)

OpenShift-on-OpenStack Cluster (2048 nodes)

Page 12: Kubernetes and OpenStack at Scale


ARCHITECTURE (cont.)

● Software:

○ Red Hat OpenStack Platform 10, based on “Newton”

○ OpenShift Container Platform 3.5 (built around K8S 1.5)

○ Red Hat Enterprise Linux 7.3 (mostly…)

● Deployment:

○ Deployed OpenStack + Ceph using TripleO

○ Deployed OpenShift Container Platform using openshift-ansible

● Applying previous learnings:

○ Storage architecture

○ Image formatting

○ Pre-baked images (see image_provisioner tool)

Page 13: Kubernetes and OpenStack at Scale

NETWORK INGRESS/ROUTING

Page 14: Kubernetes and OpenStack at Scale


NETWORK INGRESS/ROUTING TIER

Testing HAProxy Performance

● Load generator itself runs in a pod.

● Added SNI and TLS variants to the test suite.

● Configured by passing in configmaps.

● Focused on HTTP with keepalive and TLS terminated at the edge.

projects:
  - num: 1
    basename: centos-stress
    ifexists: delete
    tuning: default
    templates:
      - num: 1
        file: ./content/quickstarts/stress/stress-pod.json
        parameters:
          - RUN: "wrk"                  # which app to execute inside WLG pod
          - RUN_TIME: "120"             # benchmark run-time in seconds
          - PLACEMENT: "test"           # placement of the WLG pods based on node label
          - WRK_DELAY: "100"            # maximum delay between client requests in ms
          - WRK_TARGETS: "^cakephp-"    # extended RE (egrep) to filter target routes
          - WRK_CONNS_PER_THREAD: "1"   # how many connections per worker thread/route
          - WRK_KEEPALIVE: "y"          # use HTTP keepalive [yn]
          - WRK_TLS_SESSION_REUSE: "y"  # use TLS session reuse [yn]
          - URL_PATH: "/"               # target path for HTTP(S) requests
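The WRK_TARGETS parameter above is described as an extended regular expression that filters which routes the load generator hits. A rough sketch of that filtering step follows, assuming Python's `re` module behaves like egrep for a simple anchored pattern such as this one. The route host names are hypothetical, and this is not the test suite's own code.

```python
import re


def select_targets(route_hosts, pattern):
    """Keep the route hosts that the pattern matches anywhere, as egrep would."""
    regex = re.compile(pattern)
    return [host for host in route_hosts if regex.search(host)]


# Hypothetical route host names, for illustration only.
routes = [
    "cakephp-example-1.apps.example.com",
    "cakephp-example-2.apps.example.com",
    "nodejs-example-1.apps.example.com",
]

# "^cakephp-" anchors at the start of the name, so only the cakephp routes remain.
targets = select_targets(routes, r"^cakephp-")
```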

Page 15: Kubernetes and OpenStack at Scale


NETWORK INGRESS/ROUTING TIER

Testing HAProxy Performance (cont.)

● 1p-mix-cpu*: nbproc=1, run on any CPU

● 1p-mix-cpu0: nbproc=1, run on core 0

● 1p-mix-cpu1: nbproc=1, run on core 1

● 1p-mix-cpu2: nbproc=1, run on core 2

● 1p-mix-cpu3: nbproc=1, run on core 3

● 1p-mix-mc10x: nbproc=1, run on any core, sched_migration_cost=5000000

● 2p-mix-cpu*: nbproc=2, run on any core

● 4p-mix-cpu02: nbproc=4, run on core 2
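The variants above differ in process count (nbproc) and CPU placement. The slides do not say how the pinning was applied. One possibility, assumed here purely for illustration, is HAProxy's own `nbproc` and `cpu-map` global directives, which were available in HAProxy releases of that period. The helper name below is hypothetical.

```python
def haproxy_global_lines(nbproc, cpu=None):
    """Render global-section lines for a process count and an optional single-CPU pin."""
    lines = ["global", f"    nbproc {nbproc}"]
    if cpu is not None:
        # Assumption: bind every HAProxy process to the same CPU,
        # mirroring the "run on core N" variants in the list above.
        lines.append(f"    cpu-map all {cpu}")
    return lines


# The "1p-mix-cpu2" variant: one process pinned to core 2.
pinned = haproxy_global_lines(1, cpu=2)

# The "2p-mix-cpu*" variant: two processes, left to the scheduler.
unpinned = haproxy_global_lines(2)
```

The same placement could also be achieved outside HAProxy, for example with an affinity tool applied to the router process, so this fragment should be read as one option rather than the configuration actually used.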

Page 16: Kubernetes and OpenStack at Scale

NETWORK

Page 17: Kubernetes and OpenStack at Scale


NETWORK PERFORMANCE

Testing OpenShift-sdn (OVS+VXLAN) Performance

● OpenShift includes and uses OpenShift-sdn (OpenvSwitch + VXLAN) by default:

○ Provides full multi-tenancy

○ Is fully pluggable (as is ingress/routing tier)

○ Supports all four footprints (physical/virtual/private/public)

● Web-based workloads are mostly transactional

● Focused microbenchmark on a ping-pong test of varying payload sizes
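A ping-pong test sends a fixed-size request, waits for a same-size response, and repeats, reporting transactions per second. The sketch below shows that loop over a local socket pair. It is a stand-in to illustrate the measurement only: it does not cross the SDN, and all names in it are hypothetical.

```python
import socket
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes, since recv may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(sock, payload_size, transactions):
    """Answer each request with a response of the same size."""
    for _ in range(transactions):
        sock.sendall(recv_exact(sock, payload_size))


def ping_pong(payload_size, transactions=1000):
    """Return transactions per second for a request/response loop."""
    client, server = socket.socketpair()
    worker = threading.Thread(
        target=echo_server, args=(server, payload_size, transactions)
    )
    worker.start()
    payload = b"x" * payload_size
    try:
        start = time.perf_counter()
        for _ in range(transactions):
            client.sendall(payload)
            recv_exact(client, payload_size)
        elapsed = time.perf_counter() - start
    finally:
        client.close()  # unblocks the echo thread if the loop failed early
        worker.join()
        server.close()
    return transactions / elapsed


rate_64b = ping_pong(64, transactions=200)
```

Because each transaction waits for the previous one to complete, the result is dominated by round-trip latency rather than bandwidth, which is why this style of test suits transactional web workloads.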

Page 18: Kubernetes and OpenStack at Scale


NETWORK PERFORMANCE

Testing OpenShift-sdn (OVS+VXLAN) Performance (cont.)

● Tested mix of payload sizes and stream counts.

● tcp_rr-XXB-Yi

○ XX = # of bytes

○ Y = # of instances (streams)

● Slimmed down version of RFC2544
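The tcp_rr-XXB-Yi labels encode the payload size and the number of parallel streams. A small sketch of decoding such a label follows; the sample label and the function name are hypothetical and chosen only to match the naming pattern on the slide.

```python
import re

# tcp_rr-<bytes>B-<instances>i, following the naming pattern on the slide.
_TEST_NAME = re.compile(r"^tcp_rr-(\d+)B-(\d+)i$")


def parse_test_name(name):
    """Return (payload_bytes, instances) for a label such as 'tcp_rr-64B-4i'."""
    match = _TEST_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised test name: {name!r}")
    return int(match.group(1)), int(match.group(2))


# Hypothetical label: 64-byte payloads over 4 parallel streams.
payload_bytes, instances = parse_test_name("tcp_rr-64B-4i")
```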

Page 19: Kubernetes and OpenStack at Scale

STORAGE

Page 20: Kubernetes and OpenStack at Scale


OVERLAY2 w/ SELINUX

Next on storage wars...

● Until recently RHEL used Device Mapper for docker’s storage graph driver

○ Overlay support added in RHEL 7.2

○ Overlay2 support added in RHEL 7.3

○ Overlay2 support w/ SELinux added upstream and expected in RHEL 7.4

■ https://lkml.org/lkml/2016/7/5/409

○ Device Mapper remains default in RHEL for now, Overlay2 default in Fedora 26

■ https://fedoraproject.org/wiki/Changes/DockerOverlay2

● Let’s try it out!

Page 21: Kubernetes and OpenStack at Scale


OVERLAY2 w/ SELINUX

Results

● Single base image for all pods

● 240 pods on the node (rate limited creation)

● Reasonable memory savings

Page 22: Kubernetes and OpenStack at Scale


OVERLAY2 w/ SELINUX

Results

Page 23: Kubernetes and OpenStack at Scale


CONTAINER NATIVE STORAGE

Approach

● OpenShift Container Platform supports a wide variety of volume providers via the standard Kubernetes volume interface

● Red Hat Container Native Storage is a Gluster-based persistent volume provider deployed on OpenShift

● Used the NVMe disks as “bricks” for Gluster, exposed 1G persistent volumes

● Container Native Storage nodes marked unschedulable for other OpenShift pods

● Ran throughput numbers for create/delete operations, as well as API parallelism
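A create/delete throughput test of this kind needs a batch of small persistent volume claims. The sketch below builds such claims as plain Kubernetes manifests. The storage class name and claim names are hypothetical, "1Gi" is an assumed reading of the slide's "1G", and the storage-class annotation reflects the beta mechanism from the Kubernetes 1.5 era (later releases use a `storageClassName` field instead).

```python
import json


def pvc_manifest(name, storage_class, size="1Gi"):
    """Build a PersistentVolumeClaim manifest requesting dynamically provisioned storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": name,
            # Beta annotation for selecting a storage class (assumed for K8S 1.5).
            "annotations": {
                "volume.beta.kubernetes.io/storage-class": storage_class
            },
        },
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names: three claims against a Gluster-backed storage class.
claims = [pvc_manifest(f"cns-claim-{i}", "glusterfs-storage") for i in range(3)]
print(json.dumps(claims[0], indent=2))
```

Submitting a batch like this and timing how long each claim takes to become bound gives the create-side throughput; deleting them gives the delete side.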

Page 24: Kubernetes and OpenStack at Scale


CONTAINER NATIVE STORAGE

Results

● CNS allocated volumes in constant time

● Consistent with results for other persistent volume providers

Page 25: Kubernetes and OpenStack at Scale

NEXT STEPS

Page 26: Kubernetes and OpenStack at Scale


NEXT STEPS

To infinity, and beyond!

● Filed 40+ bugs across a variety of projects and components

● Scaling and Performance Guide, new with OpenShift Container Platform 3.5

● Getting Involved

○ “Kubernetes Ops on OpenStack” forum session

■ Wednesday, May 10, 1:50pm-2:30pm

■ Hynes Convention Center MR102

○ K8S SIG Scalability

○ K8S SIG OpenStack

Page 27: Kubernetes and OpenStack at Scale


REFERENCES

● Part 1: https://www.cncf.io/blog/2016/08/23/deploying-1000-nodes-of-openshift-on-the-cncf-cluster-part-1/

● Part 2: https://www.cncf.io/blog/2017/03/28/deploying-2048-openshift-nodes-cncf-cluster-part-2/

● Overlay2 and Device Mapper: https://developers.redhat.com/blog/2016/10/25/docker-project-can-you-have-overlay2-speed-and-density-with-devicemapper-yep/

● Red Hat Performance and Scale Trello: https://trello.com/b/M1bpo55E/scalability

Page 28: Kubernetes and OpenStack at Scale

THANK YOU
