Container Orchestration and Management Systems Comparison from Technical View Harry Zhang, Member of #CNCF


Page 1: Container Orchestration and Management Systems Comparison ...schd.ws/hosted_files/cnkc16/77/CloudNativeCon_KubeCon_PPT.pdf · Container Orchestration and Management Systems Comparison

Container Orchestration and Management Systems

Comparison from Technical View Harry Zhang, Member of #CNCF

The Scope of This Talk

• Kubernetes
  • by Cloud Native Computing Foundation
• Docker 1.12+
  • by Docker Inc.
  • Compose + Swarm is kind of legacy, so they are not included in this talk
• Mesos
  • by Apache Software Foundation
  • only with Marathon; DC/OS is not included (the scope of the latter is larger)


Chapter 1: Core Idea and Architecture

Kubernetes

• Build the right things with containers by following concepts and conventions
  • like a “Spring Framework” in the container ecosystem
• Design
  • master
    • api-server, scheduler, controller-manager
  • node
    • kubelet, kube-proxy
  • independent binaries
    • Pros: modular, transparent, manageable
    • Cons: a little complex to set up (much better as of 1.4)
  • network & volume plugins
  • driven by control loops

Kubernetes

[Diagram, animated over four slides: api-server, etcd, scheduler, and two nodes, each running a kubelet (SyncLoop) and a proxy.]

1 Pod created
2 Pod object added
3.1 New pod object detected
3.2 Bind pod with node
4.1 Detected pod bound to me
4.2 Start containers in pod
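The pod creation steps above (pod created, pod object added, scheduler binds the pod, kubelet starts its containers) can be sketched as a toy simulation. This is only an illustration under stated assumptions: `ApiServer`, `scheduler_pass`, `kubelet_sync`, and the "fewest pods" placement policy are hypothetical names and choices, not Kubernetes code, and real components watch the api-server for changes instead of being called in sequence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Pod:
    name: str
    node: Optional[str] = None    # filled in when the scheduler binds the pod
    phase: str = "Pending"


@dataclass
class ApiServer:
    """Hypothetical stand-in for api-server + etcd: the only place state lives."""
    pods: Dict[str, Pod] = field(default_factory=dict)

    def create_pod(self, name: str) -> None:
        # Steps 1 and 2: the pod is created and its object is stored.
        self.pods[name] = Pod(name)

    def list_pods(self) -> List[Pod]:
        return list(self.pods.values())


def scheduler_pass(api: ApiServer, nodes: List[str]) -> None:
    """Steps 3.1 and 3.2: detect unbound pods and bind each one to a node."""
    for pod in api.list_pods():
        if pod.node is None:
            # Toy policy (an assumption): choose the node with the fewest pods.
            load = {n: 0 for n in nodes}
            for other in api.list_pods():
                if other.node is not None:
                    load[other.node] += 1
            pod.node = min(nodes, key=lambda n: load[n])


def kubelet_sync(api: ApiServer, node: str) -> List[str]:
    """Steps 4.1 and 4.2: start the pending pods that are bound to this node."""
    started = []
    for pod in api.list_pods():
        if pod.node == node and pod.phase == "Pending":
            pod.phase = "Running"
            started.append(pod.name)
    return started


api = ApiServer()
api.create_pod("web-1")
api.create_pod("web-2")
scheduler_pass(api, ["node-a", "node-b"])
started_a = kubelet_sync(api, "node-a")
started_b = kubelet_sync(api, "node-b")
```

The point of the sketch is that no component calls another directly: each one only reads and writes shared state, which is why the components can be independent binaries.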

Kubernetes

[Diagram: the controller-manager (ControlLoop) and the kubelets (SyncLoop) sit around the api-server, etcd, and scheduler; each node also runs a proxy. A handler is attached to the control loop.]

Objects: pod, replica, namespace, service, endpoint, job, deployment, volume, petset, …

Reconcile: desired world vs. real world

Tips: Control Theory*

• It’s the basic model for:
  • Kubernetes controller and all other event loops
  • SwarmKit orchestrator
  • …

[Diagram: ControlLoop]

*Andrei, Neculai (2005). "Modern Control Theory – A historical Perspective"
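A control loop of the kind described above compares the desired state with the observed state and acts only on the difference. The following is a minimal sketch, assuming a replica count as the only piece of state; `reconcile`, `start_replica`, and the replica names are hypothetical and do not come from Kubernetes or SwarmKit.

```python
import itertools
from typing import Callable, List


def reconcile(desired: int, running: List[str],
              start: Callable[[], str], stop: Callable[[str], None]) -> None:
    """One loop iteration: act on the gap between desired and actual state."""
    diff = desired - len(running)
    for _ in range(diff):        # too few replicas: start the missing ones
        running.append(start())
    for _ in range(-diff):       # too many replicas: stop the surplus
        stop(running.pop())


ids = itertools.count(1)
stopped: List[str] = []


def start_replica() -> str:
    return f"replica-{next(ids)}"


running: List[str] = []
reconcile(3, running, start_replica, stopped.append)   # scale up from zero
running.remove("replica-2")                            # simulate a crash
reconcile(3, running, start_replica, stopped.append)   # the loop heals the gap
healed = list(running)
reconcile(1, running, start_replica, stopped.append)   # desired state lowered
```

Because the loop never records what it did earlier, the same function handles scale-up, crash recovery, and scale-down.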

Docker 1.12+

• Built-in cluster support for Docker containers
  • powered by SwarmKit
• SwarmKit Design
  • built-in data store
  • manager
    • several components built into one binary
    • control loop driven
  • worker
    • uses a pull model to connect with the manager

WARNING: SwarmKit is currently an early-stage project; expect this part to change

SwarmKit Manager

[Diagram: SwarmKit Manager components: API, Store, Orchestrator, Allocator, Scheduler, Dispatcher. The command "$ docker service create" arrives at the API.]

• API: accept commands from client
• Create object in raft based memory store
  • github.com/coreos/etcd/raft for consensus
  • github.com/hashicorp/go-memdb for in-memory object storage
  • state, cluster, node, service, task, network …

SwarmKit Manager

[Diagram: SwarmKit Manager components (API, Store, Orchestrator, Allocator, Scheduler, Dispatcher). A Service (replica=2) in the Store has two Tasks; the Orchestrator checks if replica=2 or not.]

Orchestrator
• Create Tasks from Service object
  • Task: “start a container” etc.
• Reconcile loop for Service objects
  • Control Theory again

SwarmKit Manager

[Diagram: SwarmKit Manager components (API, Store, Orchestrator, Allocator, Scheduler, Dispatcher), with "Network Create" arriving at the Allocator.]

Allocator
• Allocates IP addresses to Services and Tasks
  • (and will allocate volumes in the future)
• VIP and ports for Service
• IP for all endpoints (veth pairs) in the network the task is attached to

SwarmKit Manager

[Diagram: SwarmKit Manager components (API, Store, Orchestrator, Allocator, Scheduler, Dispatcher).]

Scheduler
• Assign Task to Node
  • unassignedTasks
  • nodeHeap
    • search the heap to find the best node: one that meets the constraints and has the lightest workload
  • ReadyFilter, ResourceFilter, ConstraintFilter
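The scheduler described above (a pipeline of filters followed by a heap lookup for the least loaded node) can be illustrated roughly as follows. This is a Python sketch and not SwarmKit's Go implementation: the filter names mirror the slide, but the `Node` and `Task` fields, the use of task count as "workload", and the label-equality constraint format are all assumptions.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    name: str
    ready: bool
    free_cpus: float
    labels: Dict[str, str] = field(default_factory=dict)
    task_count: int = 0


@dataclass
class Task:
    name: str
    cpus: float
    constraints: Dict[str, str] = field(default_factory=dict)


Filter = Callable[[Node, Task], bool]


def ready_filter(node: Node, task: Task) -> bool:
    return node.ready


def resource_filter(node: Node, task: Task) -> bool:
    return node.free_cpus >= task.cpus


def constraint_filter(node: Node, task: Task) -> bool:
    return all(node.labels.get(k) == v for k, v in task.constraints.items())


PIPELINE: List[Filter] = [ready_filter, resource_filter, constraint_filter]


def schedule(task: Task, nodes: List[Node]) -> Optional[Node]:
    """Return the least loaded node that passes every filter, or None."""
    # The index i breaks ties so that Node objects are never compared.
    heap = [(n.task_count, i, n) for i, n in enumerate(nodes)
            if all(f(n, task) for f in PIPELINE)]
    if not heap:
        return None
    heapq.heapify(heap)
    _, _, best = heap[0]
    best.task_count += 1
    best.free_cpus -= task.cpus
    return best


nodes = [
    Node("n1", True, 4.0, {"disk": "ssd"}, task_count=3),
    Node("n2", True, 4.0, {"disk": "ssd"}, task_count=1),
    Node("n3", False, 8.0, {"disk": "ssd"}),   # rejected: not ready
    Node("n4", True, 0.5, {"disk": "ssd"}),    # rejected: too few CPUs
    Node("n5", True, 8.0, {"disk": "hdd"}),    # rejected: constraint mismatch
]
chosen = schedule(Task("web", 1.0, {"disk": "ssd"}), nodes)
unschedulable = schedule(Task("big", 64.0), nodes)
```

Filters only remove candidates; ranking is left to the heap, so adding a new placement rule means adding one more filter function to the pipeline.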

SwarmKit Manager

[Diagram: the Dispatcher in the SwarmKit Manager (API, Store, Orchestrator, Allocator, Scheduler, Dispatcher) connects to three Agents, each over a grpc stream, and hands them Tasks.]

Dispatcher
• Nodes (agents) management
• Dispatch assigned Task to corresponding Node

• Worker:
  • connect to Dispatcher to check assigned tasks
  • executor: execute tasks (containers) on this Node

[Diagram: three Agents, each containing a Worker and an Executor; an Adapter connects to the Docker Daemon through docker.sock.]

Mesos 1.0

• A distributed systems kernel
  • originally designed to run big data jobs
  • core idea: fine-grained resource sharing
• Mesos Design
  • Master + Slave + ZooKeeper
  • two level scheduling
    • scheduler + executor = framework
    • need to use frameworks like Marathon for orchestration and management
  • containerizer
    • multiple container runtime & image support (>=1.0)

[Diagram, animated over three slides: two frameworks (an MPI job with its MPI scheduler, a Hadoop job with its Hadoop scheduler) sit above the Mesos master, which contains the Allocation module. Below are Mesos slaves running an MPI executor and a Hadoop executor, each with tasks.]

• Mesos master (Allocation module): pick framework to offer resources to
• Resource offer = list of (node, availableResources)
  • E.g. { (node1, <2 CPUs, 4 GB>), (node2, <3 CPUs, 2 GB>) }
• Framework scheduler: framework-specific scheduling
• Mesos slave: launches and isolates executors

*Animate: Operating Systems and Systems Programming Lecture 24 Anthony D. Joseph https://cs162.eecs.berkeley.edu/
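The resource-offer cycle above is what "two level scheduling" refers to: the master decides which framework receives an offer, and the framework decides what to run on the offered resources. Here is a toy sketch using the offer from the slide. `Master`, `FrameworkScheduler`, and `offer_to` are hypothetical names, the order in which frameworks are offered resources is hard-coded, and a real allocation module would apply a fairness policy at that point.

```python
from typing import Dict, List, Tuple

Resources = Tuple[float, float]          # (cpus, mem_gb)
Offer = List[Tuple[str, Resources]]      # list of (node, availableResources)


class FrameworkScheduler:
    """Second level: framework-specific scheduling on top of an offer."""

    def __init__(self, name: str, task: Resources, pending: int) -> None:
        self.name = name
        self.task = task          # resources needed by one task
        self.pending = pending    # tasks still waiting to be launched

    def resource_offers(self, offer: Offer) -> Offer:
        """Use as much of the offer as needed; the rest is implicitly declined."""
        need_cpus, need_mem = self.task
        launches: Offer = []
        for node, (cpus, mem) in offer:
            while self.pending > 0 and cpus >= need_cpus and mem >= need_mem:
                launches.append((node, self.task))
                cpus, mem = cpus - need_cpus, mem - need_mem
                self.pending -= 1
        return launches


class Master:
    """First level: track free resources and offer them to one framework."""

    def __init__(self, free: Dict[str, Resources]) -> None:
        self.free = dict(free)

    def offer_to(self, framework: FrameworkScheduler) -> Offer:
        launches = framework.resource_offers(list(self.free.items()))
        for node, (cpus, mem) in launches:
            free_cpus, free_mem = self.free[node]
            self.free[node] = (free_cpus - cpus, free_mem - mem)
        return launches


master = Master({"node1": (2.0, 4.0), "node2": (3.0, 2.0)})
mpi = FrameworkScheduler("mpi", task=(1.0, 2.0), pending=2)
hadoop = FrameworkScheduler("hadoop", task=(1.0, 1.0), pending=5)

mpi_launches = master.offer_to(mpi)
hadoop_launches = master.offer_to(hadoop)
```

In this run the MPI framework fills node1, and the Hadoop framework then launches two tasks on node2 before memory runs out, leaving three of its tasks pending until a later offer.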

How Does Docker Plug into Mesos?

• Before 1.0
  • Docker Containerizer
  • Docker image -> task -> mesos-docker-executor -> Docker Daemon
• Mesos 1.0
  • Supporting multiple runtimes & images
  • MesosContainerizer
    • “Mesos native container stack”
    • Isolators
    • Launcher

[Diagram: a Mesos slave running a Hadoop executor with its task, next to a mesos-docker-executor.]

Checkpoint

| | Kubernetes | Docker SwarmKit | Mesos+Marathon |
|---|---|---|---|
| Design | control loops driven | control loops driven (but in single binary) | two level scheduling |
| Coordination | etcd | built-in raft | ZooKeeper |
| Container Runtime | multiple | single, but has potential for more OCI runtimes | multiple |
| Container Image | Docker Image, ACI, more in future | Docker Image | Docker Image, ACI, more in future |
| Docker Daemon | no need | need | no need |

About Built-In Data Store

| Pros | Cons |
|---|---|
| easy to set up | hard to understand & debug |
| fewer round trips | hard to do backup/restore, migration, monitoring/audit |
| easy to do performance tuning | lack of mgmt API (like: etcd admin guide) |


Chapter 2: Control Panel

Control Panel: Orchestration + Management

• “Defines when and what to do next throughout the automated workflow”
  • workload management
  • secret management
  • configuration management
  • scale and autoscaling
  • stateful workload
  • … and more

Workload Management

e.g. “a web server with 2 replicas”

| | Kubernetes | Docker SwarmKit | Mesos+Marathon |
|---|---|---|---|
| Description | Deployment | Service | Application |
| Version Control | yes (revision) | not yet | yes (deployments) |

• Kubernetes “Deployment”
  • $ kubectl create -f <deployment-yaml>
  • $ kubectl edit <deployment>
    • this opens the object stored in etcd for editing
    • an update triggers a rolling update
  • $ kubectl set image <deployment>
  • $ kubectl scale --replicas=5 <deployment> …
  • $ kubectl rollout history <deployment>
  • $ kubectl rollout undo <deployment> --to-revision=<version>

$ kubectl edit <deployment> …
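The "rolling update" mentioned above replaces replicas a few at a time so that the workload never goes fully offline. A minimal sketch of the idea, assuming replicas are updated in place in fixed-size batches: `rolling_update` is a hypothetical helper, and `max_unavailable` is only loosely modelled on the Deployment setting of a similar name. A real Deployment does more, for example starting extra new pods before old ones are removed.

```python
from typing import List


def rolling_update(replicas: List[str], new_version: str,
                   max_unavailable: int = 1) -> List[List[str]]:
    """Replace replicas batch by batch; return the state after each batch."""
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    state = list(replicas)
    history: List[List[str]] = []
    for start in range(0, len(state), max_unavailable):
        end = min(start + max_unavailable, len(state))
        for i in range(start, end):
            state[i] = new_version    # this batch is unavailable while it restarts
        history.append(list(state))
    return history


steps = rolling_update(["v1", "v1", "v1"], "v2")
fast_steps = rolling_update(["v1", "v1", "v1"], "v2", max_unavailable=2)
```

With one replica unavailable at a time, three replicas take three batches; allowing two at a time shortens the rollout to two batches at the cost of less serving capacity during the update.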

• Docker SwarmKit “Service”
  • $ docker service create SERVICE --replicas=5 …
  • $ docker service scale SERVICE=REPLICAS
  • $ docker service update [OPTIONS] SERVICE
    • rolling update
    • 30+ update options are supported
      • --container-label-add value
      • --container-label-rm value
      • --env-add value
      • --env-rm value
      • --image string
      • …

• Mesos + Marathon “Application”
  • $ dcos marathon app start [--force] <app-id> [<instances>]
  • $ dcos marathon app update [--force] <app-id> [<properties>…]
    • rolling update
    • app dependencies are respected
  • $ dcos marathon app version list [--max-count=<max-count>] <app-id> …
  • $ dcos marathon deployment list [--json <app-id>]
  • $ dcos marathon deployment rollback <deployment-id>

Secret Management

• Kubernetes
  • Secret volume
  • encrypted and stored in etcd
  • consumed by ENV or volume
• Docker SwarmKit
  • under discussion: https://github.com/docker/swarmkit/issues/1329
• Mesos + Marathon
  • only in DC/OS
    • stored in ZooKeeper, exposed as ENV in Marathon
• Another similar feature is Configuration Management

Configuration Management

• Kubernetes
  • ConfigMap
  • stored in etcd, consumed by ENV or volume
    • $ kubectl create configmap example-redis-config --from-file=docs/redis-config
• Docker SwarmKit
  • under discussion: https://github.com/docker/swarmkit/issues/1329
• Mesos + Marathon
  • not yet

Autoscaling

• Kubernetes
  • HorizontalPodAutoscaler
  • default: CPU
  • Custom Metrics:
    • user defined endpoint, e.g. http://localhost:9100/metrics
    • shares the same metric data structure with CNCF projects like Prometheus
• Docker SwarmKit
  • not yet: https://github.com/docker/swarmkit/issues/486#issuecomment-219133613
• Mesos + Marathon
  • a standalone `marathon-autoscale.py`
  • autoscales application based on the utilization metrics from Mesos
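For the Kubernetes HorizontalPodAutoscaler listed above, the core calculation scales the replica count by the ratio of observed to target utilization and rounds up. The sketch below shows that ratio rule only. `desired_replicas` and its bounds are hypothetical names and defaults, and the real autoscaler adds details omitted here, such as a tolerance band around the target and limits on how often it rescales.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Ratio rule: ceil(current * observed / target), clamped to the bounds."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


scale_up = desired_replicas(4, 90.0, 60.0)      # pods are hotter than the target
scale_down = desired_replicas(4, 20.0, 60.0)    # pods are mostly idle
capped = desired_replicas(4, 200.0, 50.0)       # demand exceeds max_replicas
```

Rounding up errs on the side of spare capacity, and the clamp keeps a bad metric reading from scaling the workload to zero or without limit.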

Stateful Workload

• Kubernetes
  • PetSet: Replicas with stable membership and volumes
    • stable hostname
    • ordinal index
    • stable storage
• Docker SwarmKit
  • not yet, and stateful services are not recommended
• Mesos + Marathon
  • Stateful Applications
    • dynamic reservations, reservation labels, and persistent volumes

[Diagram: PetSet example]
• cassandra-0, volume 0, cassandra-0.cassandra.default.svc.cluster.local
• cassandra-1, volume 1, cassandra-1.cassandra.default.svc.cluster.local
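The PetSet example above follows a fixed naming scheme: each replica gets an ordinal index, and its hostname and DNS name are derived from that index, so they survive restarts. A small sketch that reproduces the names from the diagram, under some assumptions: `pet_identities` is a hypothetical helper, the `volume N` labels simply copy the diagram and are not real volume claim names, and `cluster.local` is assumed as the cluster domain.

```python
from typing import Dict, List


def pet_identities(petset: str, service: str, namespace: str, replicas: int,
                   cluster_domain: str = "cluster.local") -> List[Dict[str, str]]:
    """Derive the stable identity of each replica from its ordinal index."""
    identities = []
    for ordinal in range(replicas):
        hostname = f"{petset}-{ordinal}"
        identities.append({
            "pod": hostname,
            "volume": f"volume {ordinal}",    # label copied from the diagram
            "dns": f"{hostname}.{service}.{namespace}.svc.{cluster_domain}",
        })
    return identities


pets = pet_identities("cassandra", "cassandra", "default", replicas=2)
```

Because every name is a pure function of the ordinal, a replaced `cassandra-1` comes back with the same hostname and the same volume, which is what lets clustered software keep its membership list stable.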


Chapter 3: Service Discovery & Load Balance

Service Discovery & LB

• Kubernetes
  • Load Balancer
    • iptables
  • External Access
    • <externalIP route to node(s)>:<port>
    • NodePort: <ip of any node>:<NodePort>
    • External LoadBalancer
    • Ingress (L7)
      • Ingress Pod: Nginx, HAproxy
      • SSL
  • Name Service
    • built-in SkyDNS pod

[Diagram: Nodes hosting Pod 1 and Pod 2. Internal traffic hits the portal iptables rule (10.10.0.116:8001), then the random mode iptables rules (pod rule 1, pod rule 2). Outside traffic is ingress traffic to http://foo.bar.com, entering through an Ingress Pod on a Node.]
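The "random mode iptables rules" in the diagram above are evaluated in order, one rule per pod, and each rule matches with some probability. For the overall choice to be uniform, the per-rule probabilities cannot all be equal. The sketch below works through the arithmetic, assuming the common scheme in which rule i out of n matches with probability 1/(n - i) and the last rule always matches; the function names are hypothetical.

```python
from fractions import Fraction
from typing import List


def rule_probabilities(n: int) -> List[Fraction]:
    """Match probability written on each of n sequential rules."""
    return [Fraction(1, n - i) for i in range(n)]


def selection_probabilities(n: int) -> List[Fraction]:
    """Overall chance that traffic ends up at each backend."""
    result = []
    remaining = Fraction(1)            # chance of reaching the current rule
    for p in rule_probabilities(n):
        result.append(remaining * p)
        remaining *= 1 - p             # traffic that falls through to the next rule
    return result


rules = rule_probabilities(3)
chances = selection_probabilities(3)
```

For three pods the rules carry probabilities 1/3, 1/2, and 1, yet each pod receives exactly one third of new connections, because later rules only see the traffic that earlier rules let through.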

Service Discovery & Load Balance

• Docker SwarmKit
  • Load Balancer
    • ipvs NAT mode
  • External Access
    • Routing Mesh
  • Name Service
    • embedded DNS server
      • for service and task
• Two kinds of sandboxes
  • ingress: on every worker
  • container: on workers where task lives
• Two networks are needed
  • ingress overlay
  • user-defined overlay

[Diagram: two Workers, each with an ingress sandbox; a container sandbox holds Container 1 and Container 2. Labels: outside traffic (when service created with -p), port mapping, iptables, ipvs, internal traffic, DNS: svc->vip, Gossip to update the iptables & ipvs rules.]

Service Discovery & Load Balance

• Mesos + Marathon
  • Load Balancer
    • Marathon-lb: HAproxy based
    • virtual addresses (VIPs) in DC/OS
  • External Access
    • http://<public agent ip>:<servicePort>
    • external load balancer
  • Name Service
    • Mesos-DNS

[Diagram: two Slaves hosting Container 1 and Container 2, with Marathon-lb and Mesos-DNS.]

Checkpoint

| | Kubernetes | Docker SwarmKit | Mesos+Marathon |
|---|---|---|---|
| Filter | iptables VIP | iptables VIP | no need |
| LB | iptables random mode | ipvs NAT mode | HAproxy |
| External Access | nodeIP:port, Ingress, external IP/LB | Routing Mesh (ingress overlay) | same as exposing HAproxy to public |
| Update | watch etcd | gossip | marathon_lb.py & template |


Chapter 4: Scheduling

Kubernetes

• Pod as schedule unit
  • this is unique, but why?
• Multi-Scheduler
  • pod1: scheduler1, pod2: scheduler2
• QoS tiers
  • anyone remember the core idea of Borg?
  • Guaranteed (requests == limit)
  • Burstable (requests < limit)
  • Best-Effort (no request & limit)
• More Borg features are on the way
  • equivalence class, pod level resource boundary …

[Diagram: Burstable Pod]
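The three QoS tiers above are decided purely by how requests relate to limits. A simplified classifier follows, assuming a single container and a single resource; `qos_tier` is a hypothetical function, whereas real Kubernetes evaluates every container in the pod and both CPU and memory. The limit-only case is treated as Guaranteed on the assumption that an unset request defaults to the limit.

```python
from typing import Optional


def qos_tier(request: Optional[float], limit: Optional[float]) -> str:
    """Classify one container's resource settings into a QoS tier."""
    if request is None and limit is None:
        return "Best-Effort"          # no request and no limit
    if request is None:
        request = limit               # assumption: request defaults to the limit
    if limit is not None and request > limit:
        raise ValueError("request must not exceed limit")
    if limit is not None and request == limit:
        return "Guaranteed"           # requests == limit
    return "Burstable"                # requests < limit, or no limit at all


tiers = {
    "equal": qos_tier(0.5, 0.5),
    "below": qos_tier(0.25, 1.0),
    "unset": qos_tier(None, None),
    "request_only": qos_tier(0.25, None),
    "limit_only": qos_tier(None, 1.0),
}
```

The tier matters under resource pressure: a Burstable pod may use more than it requested while capacity is free, but it is reclaimed before a Guaranteed pod when the node runs short.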

Docker SwarmKit

• Task (container) as schedule unit
• Multi-Scheduler
  • not yet
• Strategy
  • pipeline of filters
    • ReadyFilter, ResourceFilter, ConstraintFilter
  • to sort nodeHeap
• QoS tiers
  • not yet

Mesos + Marathon

• Task as schedule unit (Pod support is planned)
• Multi-Scheduler
  • Mesos is designed to run multiple frameworks (schedulers)
• Strategy
  • Two level scheduling (the killer feature of Mesos)
    • Twitter scale …
    • fine-grained resource sharing (like Borg)
• QoS tiers
  • of course
• And much more
  • task eviction, data locality, max-min fairness, priority, offer reject, Delay Scheduling
  • and Big Data of course


Chapter 5: Summary

A Use Case: hyper.sh

• hyper.sh is “Docker Done the Right Way”
  • $ hyper run mysql
  • $ hyper run --link mysql wordpress
  • $ hyper fip attach 22.33.44.55 wordpress
• But Hyper.sh is powered by Kubernetes
  • and also maintains Kubernetes features

Extensibility Really Matters

• Hypernetes (h8s = k8s + HyperContainer) is what’s backing Hyper.sh:
  • HyperContainer runtime
  • Multi-tenant network based on Neutron
  • Custom Cinder plugin with Ceph backend
  • Custom HAproxy based Service
• Kubernetes is truly extensible and configurable

Just a Personal Opinion

• So, if
  • I am an individual developer/org, trying to find something that is friendly and just works
    • I use Docker SwarmKit
  • I have a “Twitter scale” cluster to manage, or I am a Big Data user
    • I need Mesos
• But if what I need is an infrastructure layer to build my systems on top of in the right way
  • Kubernetes is the choice


THE END @resouer