Monitoring and Log Management for Docker Swarm and Kubernetes Stefan Thies Sematext Group, Inc.

Page 1: Monitoring and Log Management for

Monitoring and Log Management for Docker Swarm and Kubernetes

Stefan Thies Sematext Group, Inc.

Page 2: Monitoring and Log Management for

Sematext & I

• Logsene: logs
• SPM: metrics
• Docker Agent (#nodejs)

Page 3: Monitoring and Log Management for

Agenda

• What is
  • Centralized Log Management + Performance Monitoring
  • Kubernetes / Swarm
• Container Logs
• Container Metrics
• Example: #Swarm3k Monitoring
• Summary

Page 4: Monitoring and Log Management for

Centralized Log Management

Logagent

Page 5: Monitoring and Log Management for

Centralized Monitoring

Pipeline stages:
• Expose Metrics
• Collect Metrics
• Ship Metrics
• Store Metrics
• Aggregate Metrics
• Visualize Metrics
• Correlation with Logs
• Anomaly Detection
• Alerting

Components:
• Server + App / Container Configuration
• Monitoring Agents
• Time Series Database
• Dashboard Tools, Alerting Tools, ChatOps Tools

Page 6: Monitoring and Log Management for

https://sematext.com/blog/2016/07/19/open-source-docker-monitoring-logging/

Page 7: Monitoring and Log Management for

Levels: Orchestration, Container, Pod, Node

• Node 1
  • Pod 1 (namespace ns1): Kibana, Elasticsearch
  • Pod 2 (namespace ns2): Redis
• Services (proxy)
• Replication Controllers
• DaemonSets
• HorizontalPodAutoscaler

Page 8: Monitoring and Log Management for

Kubernetes Dashboard / Heapster

• Current status
• Shows basic resource usage for workloads (Pod)
• Simple logs view
• Heapster is required for autoscaling features

Page 9: Monitoring and Log Management for

Levels: Orchestration, Container, Stacks, Nodes

• Node 1
  • ELK (compose, app bundle): Kibana 1, Elasticsearch 1
  • Redis (service): redis1
• Node 2
  • ELK: Elasticsearch 2, Elasticsearch 3

Page 10: Monitoring and Log Management for

Kubernetes != Swarm

• Common base is Docker
  • Docker Logs & Metrics
  • Docker API

Page 11: Monitoring and Log Management for

Container Logs

Page 12: Monitoring and Log Management for

Docker Logging Drivers

• json-file (default): files
• journald (CoreOS): system journal
• Syslog: TCP, UDP
• Fluentd: TCP
• $plunk: TCP
• Gelf

Destinations: Local Log Shipper, Centralized Log Management

Page 13: Monitoring and Log Management for

Docker logs

Containers (should) log to stdout/stderr !!!

docker logs container_id

docker logs container_name

Container logs → Docker API → Docker client

Page 14: Monitoring and Log Management for

Fun with Docker logging drivers

$ docker run --log-driver=syslog --log-opt syslog-address=udp://$HOSTNAME:514 --log-opt tag="{{.ImageName}}#{{.Name}}#{{.ID}}" -p 9003:80 --name nginx1 -d nginx

$ docker logs nginx1

"logs" command is supported only for "json-file" and "journald" logging drivers (got: syslog)

Add Context!

Page 15: Monitoring and Log Management for

More fun with TCP logging drivers!

docker run --log-driver=syslog --log-opt syslog-address=tcp://127.0.0.1:514 --log-opt tag="{{.ImageName}}#{{.Name}}#{{.ID}}" -p 9004:80 -d nginx

docker: Error response from daemon: Failed to initialize logging driver: dial tcp 127.0.0.1:514: getsockopt: connection refused.
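The error above means nothing accepted the TCP connection on 127.0.0.1:514 when the container started. A minimal pre-flight sketch in Python, assuming you want to check the endpoint before launching containers that use a TCP logging driver; the helper name `syslog_reachable` is hypothetical and not part of Docker:

```python
import socket


def syslog_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:514 mirrors the syslog-address used on the slide.
    if syslog_reachable("127.0.0.1", 514):
        print("syslog endpoint is up; the container can be started")
    else:
        print("nothing is listening; start the syslog server first")
```

A check like this only narrows the window: the endpoint can still go away after the check succeeds.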

Page 16: Monitoring and Log Management for

Fix it – run syslog server first!

docker run -d --name syslog -p 514:514 factorish/syslog -t tcp

docker run --log-driver=syslog … nginx

curl localhost:9004

docker logs syslog

==> syslog listening on tcp

<30>Nov 17 18:23:43 nginx#nginx1#afebdfff0eed[1710]: 172.17.0.1 - - [17/Nov/2016:18:23:43 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.49.1" "-"

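The tag template `{{.ImageName}}#{{.Name}}#{{.ID}}` is what makes the syslog line above attributable to a container. A small Python sketch of how a receiver could split such a line back into fields, assuming the `#`-separated tag and the classic `<PRI>timestamp tag[pid]: message` layout shown in the sample; the regular expression and the function name are illustrative, not taken from any agent:

```python
import re
from typing import Optional

# Matches lines shaped like the sample on the slide.
LINE_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<image>[^#\s]+)#(?P<name>[^#\s]+)#(?P<id>[0-9a-f]+)"
    r"\[(?P<pid>\d+)\]: "
    r"(?P<message>.*)$"
)


def parse_tagged_syslog(line: str) -> Optional[dict]:
    """Split a tagged syslog line into container metadata and message."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    pri = int(fields.pop("pri"))
    # Syslog priority encodes facility * 8 + severity.
    fields["facility"] = pri // 8
    fields["severity"] = pri % 8
    fields["pid"] = int(fields["pid"])
    return fields


if __name__ == "__main__":
    sample = (
        '<30>Nov 17 18:23:43 nginx#nginx1#afebdfff0eed[1710]: '
        '172.17.0.1 - - [17/Nov/2016:18:23:43 +0000] "GET / HTTP/1.1" '
        '200 612 "-" "curl/7.49.1" "-"'
    )
    print(parse_tagged_syslog(sample))
```

For the sample line this yields image `nginx`, name `nginx1`, ID `afebdfff0eed`, facility 3 and severity 6.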
Page 17: Monitoring and Log Management for

Is UDP better?
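The slide leaves the question open. One property worth keeping in mind is that a UDP sender gets no delivery feedback, so a missing receiver does not stop the sender, and messages sent while it is missing are simply lost. A minimal Python sketch of that behaviour, assuming an unconnected datagram socket; `send_udp` is a hypothetical helper:

```python
import socket


def send_udp(message: bytes, host: str, port: int) -> int:
    """Send one datagram and return the number of bytes handed to the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # sendto() reports only that the datagram was queued locally;
        # it says nothing about whether anyone received it.
        return sock.sendto(message, (host, port))


if __name__ == "__main__":
    sent = send_udp(b"<30>hello from a container", "127.0.0.1", 514)
    print(f"{sent} bytes sent, delivery unknown")
```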

Page 18: Monitoring and Log Management for

Alternatives?

Docker log files (json-file or journald) → API → Agent (disk buffer) → Remote Log Storage

• Docker API provides the most complete information!
• Reliable networks and backend services? Better buffer & retransmit in case of failure!
• Attach metadata to logs/metrics or route data to different servers or indices?
• "docker logs" works & logs are stored on local disk!
• Centralize search, analytics, alerts, access permissions
• Parse logs
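The "buffer & retransmit" idea can be sketched as follows. This is a simplified Python illustration, not the implementation of any particular agent: the class name, the JSON-lines buffer file and the `send` callback are assumptions made for the example. Records are written to disk first and removed only after the backend accepted them, which gives at-least-once delivery (duplicates are possible after a crash).

```python
import json
import os
from typing import Callable


class DiskBufferedShipper:
    """Append records to a local file and drain it when the backend is up."""

    def __init__(self, buffer_path: str, send: Callable[[dict], None]) -> None:
        self.buffer_path = buffer_path
        self.send = send  # hypothetical callback; raises OSError on failure

    def ship(self, record: dict) -> None:
        # Persist first so the record survives a failed or slow backend.
        with open(self.buffer_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        self.flush()

    def flush(self) -> int:
        """Send buffered records in order; return how many were delivered."""
        if not os.path.exists(self.buffer_path):
            return 0
        with open(self.buffer_path, encoding="utf-8") as fh:
            pending = [json.loads(line) for line in fh if line.strip()]
        sent = 0
        for record in pending:
            try:
                self.send(record)
            except OSError:
                break  # backend unavailable; retry on the next flush
            sent += 1
        # Keep only the records that were not delivered.
        with open(self.buffer_path, "w", encoding="utf-8") as fh:
            for record in pending[sent:]:
                fh.write(json.dumps(record) + "\n")
        return sent
```

A real agent would also cap the buffer size and retry on a timer instead of only on the next `ship()` call.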

Page 19: Monitoring and Log Management for

Automatic tagging of logs, metrics, events

• Automatic tagging of logs / metrics with
  • Docker
    • Container Name / ID
    • Image Name / ID
    • Labels / Environment
    • Hostname / IP
  • Kubernetes
    • Namespace, Pod Name, UID
  • Swarm
    • Swarm Service Name, ID, Compose Project, Container # (scale)
• Single collector for logs, metrics, events, metadata
• Base for correlation and visualisation
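A rough Python sketch of such tagging, assuming the input is the JSON object returned by `docker inspect` for one container. The label keys in `LABEL_TO_TAG` are the ones kubelet and Swarm have commonly attached to Docker containers; treat them, the tag names and the function name as assumptions to verify against your own cluster:

```python
from typing import Dict

# Assumed mapping from container labels to tag names (verify per platform).
LABEL_TO_TAG = {
    "io.kubernetes.pod.namespace": "k8s_namespace",
    "io.kubernetes.pod.name": "k8s_pod_name",
    "io.kubernetes.pod.uid": "k8s_pod_uid",
    "com.docker.swarm.service.name": "swarm_service_name",
    "com.docker.swarm.service.id": "swarm_service_id",
    "com.docker.compose.project": "compose_project",
}


def build_tags(container: dict, hostname: str) -> Dict[str, str]:
    """Derive log/metric tags from one `docker inspect` result."""
    config = container.get("Config") or {}
    labels = config.get("Labels") or {}
    tags = {
        "host": hostname,
        "container_id": container.get("Id", "")[:12],
        # Docker reports names with a leading slash, e.g. "/nginx1".
        "container_name": container.get("Name", "").lstrip("/"),
        "image_name": config.get("Image", ""),
    }
    for label, tag in LABEL_TO_TAG.items():
        if label in labels:
            tags[tag] = labels[label]
    return tags
```

Attaching the same tag set to logs, metrics and events is what makes the later correlation possible.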

Page 20: Monitoring and Log Management for

Container Metrics Collection

Page 21: Monitoring and Log Management for

Collection

Page 22: Monitoring and Log Management for

Metric collection via Docker API
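As an illustration of what collection via the Docker API involves, the sketch below derives a CPU percentage from one sample of the container stats endpoint. It assumes the JSON layout with `cpu_stats` and `precpu_stats` (each holding `cpu_usage.total_usage` and `system_cpu_usage`) and follows the calculation commonly used for `docker stats`; the function name is hypothetical.

```python
def cpu_percent(stats: dict) -> float:
    """Compute container CPU usage in percent from one stats sample."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    # Older daemons omit online_cpus; fall back to the per-CPU list length.
    online = (
        cpu.get("online_cpus")
        or len(cpu["cpu_usage"].get("percpu_usage") or [])
        or 1
    )
    return cpu_delta / system_delta * online * 100.0


if __name__ == "__main__":
    sample = {
        "cpu_stats": {
            "cpu_usage": {"total_usage": 1200},
            "system_cpu_usage": 11000,
            "online_cpus": 2,
        },
        "precpu_stats": {
            "cpu_usage": {"total_usage": 1000},
            "system_cpu_usage": 10000,
        },
    }
    print(f"{cpu_percent(sample):.1f}%")
```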

Page 23: Monitoring and Log Management for

Smart monitoring agent - all in one

Docker API → Agent (disk buffer) → Remote Storage (metrics, logs, events)

• Docker API provides labels, metrics, logs, events …
• Reliable networks and backend services? Better buffer & retransmit in case of failure!
• Auto-tagging using container labels. Discovery of services
• Centralize logs, metrics, analytics, alerts, access permissions

Page 24: Monitoring and Log Management for

Integrate application monitoring in the stack

- Custom images: add/remove app with all req. options
- Start monitoring, reading config from etcd

Diagram labels: App (config to expose metrics), App Monitor (configured for App), Container, Service Discovery (etcd, consul)

Page 25: Monitoring and Log Management for

Auto Discovery via Docker API and Labels?

Diagram labels: App Container (config to expose metrics), App Monitor, Docker Monitor, run, discovery, Docker, automatic run

Page 26: Monitoring and Log Management for

Key Container Metrics

Page 27: Monitoring and Log Management for

Node Storage

• Good kids clean up their rooms. Good Docker ops clean up their disks by removing unused containers & images.

Page 28: Monitoring and Log Management for

Number of containers per host

• Verify deployment strategies

Page 29: Monitoring and Log Management for

CPU quota per container

Page 30: Monitoring and Log Management for

Container memory and OOM counter

Page 31: Monitoring and Log Management for

Docker Events
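Docker events can be turned into counters, for example to track OOM kills next to the memory metrics. The Python sketch below assumes one JSON object per line, as produced by `docker events --format '{{json .}}'`, with `Type` and `Action` fields; the function name is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def count_container_events(lines: Iterable[str]) -> Counter:
    """Count container events per action (start, die, oom, ...)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Skip network, volume, image and other non-container events.
        if event.get("Type") != "container":
            continue
        counts[event.get("Action", "unknown")] += 1
    return counts


if __name__ == "__main__":
    stream = [
        '{"Type": "container", "Action": "start", "Actor": {"ID": "abc"}}',
        '{"Type": "container", "Action": "oom", "Actor": {"ID": "abc"}}',
        '{"Type": "container", "Action": "die", "Actor": {"ID": "abc"}}',
        '{"Type": "network", "Action": "connect", "Actor": {"ID": "net"}}',
    ]
    print(count_container_events(stream))
```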

Page 32: Monitoring and Log Management for

Swarm Task Status

Page 33: Monitoring and Log Management for

Limit container resources for your apps!

• Set CPU quotas: --cpu-quota=6000
• Limit memory and configure the app in the container to the same limits: -m 512mb
• Disable swap: --memory-swap=0
• To keep a Docker container from eating all your disk IO, use e.g. --device-write-bps /dev/sda:1mb
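To make the CPU quota value concrete: the quota is the CPU time in microseconds a container may use per scheduling period. The Python sketch below assumes the default period of 100000 microseconds (what Docker uses when `--cpu-period` is not set); the function name is hypothetical.

```python
DEFAULT_CPU_PERIOD_US = 100_000  # assumed default when --cpu-period is unset


def cpu_fraction(quota_us: int, period_us: int = DEFAULT_CPU_PERIOD_US) -> float:
    """Return the share of one CPU that a quota/period pair allows."""
    if quota_us <= 0 or period_us <= 0:
        raise ValueError("quota and period must be positive")
    return quota_us / period_us


if __name__ == "__main__":
    # The slide's example value: 6000 us of CPU time per 100000 us period.
    print(f"{cpu_fraction(6000):.0%}")  # prints "6%"
```

Under that assumption, `--cpu-quota=6000` limits the container to about 6% of one CPU.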

Page 34: Monitoring and Log Management for

Automatic Deployment of monitoring agents

• One command to run a service on each node joining the cluster
• Kubernetes:
  • DaemonSet creates a pod per node
    kubectl create -f sematext-agent.yml
• Swarm:
  • Global Service
    docker service create --mode global ...

Page 35: Monitoring and Log Management for

Swarm3k Monitoring

Page 36: Monitoring and Log Management for

Swarm3k Requirements

• Monitoring
  • Host metrics
  • Container metrics
  • Docker Events
  • Task Monitoring
• Collect Container Logs: Task Errors only
• 3000+ Nodes (actual: 4.7k)

• 150.000 containers (actual: 60k)

• Duration 8 hours – 28 GB data collected
• Public/shared Dashboard for the community

Page 37: Monitoring and Log Management for

Pre-flight test with 500 nodes

• 60.000 containers deployed in less than 5 minutes!

Page 38: Monitoring and Log Management for

Swarm3k in one picture

Page 39: Monitoring and Log Management for

Limits in visualisation

Missing labels to group hosts or containers

Page 40: Monitoring and Log Management for

Summary

• Setup of Monitoring & Logging is complex in dynamic environments
• Kubernetes != Swarm (yet). Common base: Docker Containers
• Smart Agents to collect, analyze, aggregate metrics, events and logs
  • Auto discovery of containers for data collection
  • Use metadata to tag metrics & logs as base for correlation and visualization
• Integrate monitoring in application stacks for app specific metrics
  • Auto Discovery of services and automatic configuration for application level monitoring

Page 41: Monitoring and Log Management for

Join us!

We are engineers!

We develop DevOps tools!

We are DevOps people!

We do fun stuff ;)

Sematext is hiring!

http://sematext.com/jobs

Page 42: Monitoring and Log Management for

Thank you for listening! Get in touch!

Stefan Thies, @seti321

http://sematext.com, @sematext

Come talk to us at the booth