
Containers & Kubernetes
Umeå University, 2017/05/11

Filip Grzadkowski <filipg@google.com>
Technical Leader & Engineering Manager
@fgrzadkowski

So I have my app...

Old Way: Shared machines

[Diagram: several machines, each running many apps directly on shared libs and a shared kernel]

Old Way: Virtual machines

[Diagram: each app packaged into its own VM, with its own libs and its own guest kernel]

New Way: Containers

How?

Implemented by a number of (unrelated) Linux APIs:

• cgroups: restrict the resources a process can consume (CPU, memory, disk I/O, ...)
• namespaces: change a process's view of the system (network interfaces, PIDs, users, mounts, ...)
• capabilities: limit what a user can do (mount, kill, chown, ...)
• chroots: determine what parts of the filesystem a user can see
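To make two of these concrete, here is a minimal shell sketch (an illustration only, assuming util-linux's unshare and cgroup v2 mounted at /sys/fs/cgroup; the "demo" group name is made up):

# Run a shell in fresh PID, mount, and network namespaces:
$ sudo unshare --pid --fork --mount --net /bin/sh

# Cap memory with cgroups: create a group, set a 100 MiB limit,
# then move the current shell into it:
$ sudo mkdir /sys/fs/cgroup/demo
$ echo 104857600 | sudo tee /sys/fs/cgroup/demo/memory.max
$ echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs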


Google has been developing and using containers to manage our applications for over 12 years.

Images by Connie Zhou

Everything at Google runs in containers:
• Gmail, Web Search, Maps, ...
• MapReduce, batch, ...
• GFS, Colossus, ...
• Even GCE itself: VMs run in containers

We launch over 2 billion containers per week.

Docker

[Chart: Google Trends search interest in "Docker" over time. Source: Google Trends]

Now that we have containers...

Isolation: Keep jobs from interfering with each other

Scheduling: Where should my job be run?

Lifecycle: Keep my job running

Discovery: Where is my job now?

Constituency: Who is part of my job?

Scale-up: Making my jobs bigger or smaller

Auth{n,z}: Who can do things to my job?

Monitoring: What’s happening with my job?

Health: How is my job feeling?

Enter Kubernetes

Manage applications, not machines


Part 1 - Basics


App deployment

Pods

Example: data puller & web server

[Diagram: a Pod running a File Puller and a Web Server that share a Volume; a Content Manager feeds the puller, and Consumers read from the web server]
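A minimal sketch of such a pod, with two containers sharing a volume (names and images are hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: web-with-puller
spec:
  volumes:
  - name: content
    emptyDir: {}        # scratch volume shared by both containers
  containers:
  - name: file-puller
    image: example/file-puller   # hypothetical puller image
    volumeMounts:
    - name: content
      mountPath: /data
  - name: web-server
    image: nginx
    volumeMounts:
    - name: content
      mountPath: /usr/share/nginx/html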

Run my application.


Labels

[Diagram: four pods, each carrying a set of labels]
  App: MyApp, Phase: prod, Role: FE
  App: MyApp, Phase: test, Role: FE
  App: MyApp, Phase: prod, Role: BE
  App: MyApp, Phase: test, Role: BE
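Labels can be set in a pod's metadata, or attached and changed later from the command line, for example (pod name illustrative):

$ kubectl label pod my-pod App=MyApp Phase=prod Role=FE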


Selectors

Selectors pick out subsets of the pods above by matching on their labels:

  App = MyApp               → all four pods
  App = MyApp, Role = FE    → the two FE pods
  App = MyApp, Role = BE    → the two BE pods
  App = MyApp, Phase = prod → the two prod pods
  App = MyApp, Phase = test → the two test pods
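The same selectors work anywhere the API filters by labels, e.g. on the command line:

$ kubectl get pods -l App=MyApp
$ kubectl get pods -l App=MyApp,Role=FE
$ kubectl get pods -l App=MyApp,Phase=prod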


ReplicaSet
- name = "my-rc"
- selector = {"App": "MyApp"}
- podTemplate = { ... }
- replicas = 4

[Diagram: the ReplicaSet reconciles against the API Server: "How many?" -> 3 -> "Start 1 more" -> OK -> "How many?" -> 4]

Run N copies of my application.

ReplicaSet

[Diagram sequence: four pods (f0118, d9376, b0111, a1209) spread across nodes 1-4, with Desired = 4, Current = 4. Pod d9376 is lost and Current drops to 3, so the ReplicaSet starts a replacement pod c9bad, returning to Desired = 4, Current = 4]
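A minimal sketch of this ReplicaSet as a manifest, written against today's apps/v1 API (container image is hypothetical):

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-rc
spec:
  replicas: 4                 # desired count
  selector:
    matchLabels:
      App: MyApp              # which pods this ReplicaSet owns
  template:                   # podTemplate: what to start when short
    metadata:
      labels:
        App: MyApp
    spec:
      containers:
      - name: myapp
        image: example/myapp  # hypothetical image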


Networking


Docker networking

[Diagram: three hosts on subnets 10.1.1.0/24, 10.1.2.0/24, and 10.1.3.0/24; each host hands out the same private container addresses (172.16.1.1, 172.16.1.2, ...), so every cross-host container connection must pass through NAT]


Kubernetes networking

IPs are routable
• vs. Docker's default private IPs

Pods can reach each other without NAT
• even across nodes

No brokering of port numbers
• too complex, why bother?

This is a fundamental requirement
• can be L3 routed
• can be underlayed (cloud)
• can be overlayed (SDN)


[Diagram: the same three hosts, but each pod gets its own routable IP from its node's subnet (10.1.1.0/24, 10.1.2.0/24, 10.1.3.0/24), so pods talk to each other directly, with no NAT]

Services

Expose my application.

[Diagram: a Service with a stable IP and DNS name; a Client connects to the Service, which forwards to the pods matching selector = {"App": "MyApp"}]
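A minimal sketch of such a Service (the port numbers are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    App: MyApp        # traffic goes to pods matching these labels
  ports:
  - port: 80          # the Service's stable port
    targetPort: 8080  # assumed container port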


...and more!

More

• Resource isolation
• L7 load balancing
• ConfigMap
• Secret
• Deployment
• Job
• DaemonSet
• PersistentVolumes
• Rolling updates
• DNS
• Autoscaling
• Monitoring
• Logging
• rkt, Hyper
• Authorization
• Dashboard
• ...


Some takeaways


Modularity

Loose coupling is a goal everywhere
• simpler
• composable
• extensible

Code-level plugins where possible

Isolate risk through interchangeable parts

Examples: ReplicaSet, Scheduler


Control loops

Drive current state -> desired state

Act independently

APIs - no shortcuts or back doors

Example: ReplicaSet

[Diagram: the control loop: observe -> diff -> act]
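The pattern is simple enough to caricature in a few lines of shell (an illustration of the loop, not the real controller; the label and ReplicaSet name follow the earlier slides):

while true; do
  # observe: how many matching pods exist right now?
  current=$(kubectl get pods -l App=MyApp --no-headers | wc -l)
  # diff + act: converge toward the desired count of 4
  if [ "$current" -ne 4 ]; then
    kubectl scale replicaset my-rc --replicas=4
  fi
  sleep 10
done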


Part 2 - Auto-recovery


Autoscaling


Autoscaling

• Pod Autoscaling
• Cluster Autoscaling


HorizontalPodAutoscaler


Configuration:

spec:
  scaleTargetRef:
    kind: ReplicationController
    name: my-rc
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

$ kubectl autoscale rc my-rc --min=2 --max=10 --cpu-percent=70


How does it work?


[Diagram sequence: more users arrive; the HorizontalPodAutoscaler observes the rising utilization and resizes the ReplicaSet, so more pods are started]

Scaling algorithm

target # instances = ceil( sum(utilization) / target utilization )

min <= target # instances <= max
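Worked example (numbers illustrative): 5 pods each at 90% CPU give sum(utilization) = 450; with a target utilization of 70, ceil(450 / 70) = 7 instances, which is then clamped to the [min, max] range.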


Cluster Autoscaling


Overview

[Diagram sequence: a pod is pending because it fits on neither Node A nor Node B; the autoscaler adds Node C and the pod is scheduled there; later, when Node C is no longer needed, it is removed]

• Scale UP when a pod is pending
• Scale DOWN when a node has not been needed for a long time
• Simulations (the autoscaler simulates scheduling to make these decisions)

Status: beta since v1.3
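On GKE, for example, cluster autoscaling is switched on per node pool (cluster and pool names illustrative):

$ gcloud container clusters update my-cluster \
    --enable-autoscaling --min-nodes=1 --max-nodes=10 \
    --node-pool=default-pool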


Machine failures


[Diagram sequence: Node B stops responding; the Node Controller marks the node unresponsive and its pods are evicted; the ReplicaSet controller sees desired: 4, actual: 3 and starts a replacement pod on a healthy node, restoring desired: 4, actual: 4]

[Diagram sequence: the Node Controller detects a broken disk and taints Node A with Taint=BrokenDisk; new pods are scheduled away from the tainted node onto Node B, while a pod declaring Toleration=BrokenDisk may still be placed on Node A]
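In today's kubectl this looks roughly as follows (the taint key and value are illustrative):

$ kubectl taint nodes node-a disk=broken:NoSchedule

...and a pod opts back in by declaring a matching toleration in its spec:

tolerations:
- key: disk
  operator: Equal
  value: broken
  effect: NoSchedule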

Is it useful?


AWS S3 outage

Realistically this kind of failure on a day like Tuesday could have resulted in hours of downtime. With Kubernetes, there was never a moment of panic, just a sense of awe watching the automatic mitigation as it happened.

Rob Scott, VP of Software Architecture @ Spire

source


Kubernetes is Open

https://kubernetes.io

Code: github.com/kubernetes/kubernetes
Chat: slack.k8s.io
Twitter: @kubernetesio

open community

open design

open source

open to ideas