Graphite, an introduction

An introduction to graphite and why it's so great.

Transcript of Graphite, an introduction

Page 1: Graphite, an introduction

Graphite: An Introduction

Scaling real-time monitoring

Page 2: Graphite, an introduction

The purpose today

Page 3: Graphite, an introduction

What is graphite

Page 4: Graphite, an introduction

Why it’s so great

Page 5: Graphite, an introduction

How to graph (It’s really easy!)

Page 6: Graphite, an introduction

How we use graphite

Page 7: Graphite, an introduction

First, a definition

Page 8: Graphite, an introduction

Alerts + Metrics = Monitoring

Metrics: Graphite, Cacti, Munin

Alerting: Nagios, Icinga

Both: Zenoss, Hyperic, Zabbix, PNP4Nagios

Page 9: Graphite, an introduction

What is graphite

Page 10: Graphite, an introduction
Page 11: Graphite, an introduction
Page 12: Graphite, an introduction

About graphite

● Consists of 3 parts:
  ○ carbon (daemons that relay, cache, and aggregate metrics)
  ○ whisper (graphite’s equivalent of RRD files)
  ○ Web UI (Django web application: graph composer, simple dashboard)

Page 13: Graphite, an introduction

Why graphite?

Page 14: Graphite, an introduction

Why graphing?

Discover trends and patterns
● What time of the day do we get the most users?
● When x happened, what was the effect on y?
● How many hits am I getting per hour? How does this compare to last week? Last month?

Predict future events
● When will we need to add more servers? Databases?

Negative feedback
● Did the release into production fix problem x?

Page 15: Graphite, an introduction

Cacti SUCKS

A few reasons:

Ancient user interface (no javascript/ajax), terrible workflow, cannot push metrics, no formulas, no graph introspection, cannot feed out-of-sequence metrics, ugly graphs, no API, must expose system/os metrics on host via snmp, no graph composer, no custom graphs, must predefine metrics, must predefine graphs, static polling interval, unscalable, tons of work to create one graph, no 3rd party ecosystem, etc.

Page 16: Graphite, an introduction

Graphite ++

Page 17: Graphite, an introduction

Simple

Page 18: Graphite, an introduction

Powerful

Page 19: Graphite, an introduction

Functions (sum, derivatives, integrals, timeshift, mostDeviant, scale, averages, etc.)
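As a rough illustration of how these functions are applied, the sketch below builds a render URL that sums a group of series and overlays the same sum shifted back one week. The host name `graphite.example.com` and the metric path `web.*.requests` are made-up placeholders, not part of the original slides.

```python
from urllib.parse import urlencode

# Hypothetical Graphite host; replace with your own.
GRAPHITE_URL = "http://graphite.example.com"


def render_url(targets, time_from="-24h", fmt="png"):
    """Build a /render URL with one 'target' parameter per expression."""
    params = [("target", t) for t in targets]
    params += [("from", time_from), ("format", fmt)]
    return GRAPHITE_URL + "/render?" + urlencode(params)


# Hypothetical metric names: requests per web server.
targets = [
    "sumSeries(web.*.requests)",                   # total across servers
    'timeShift(sumSeries(web.*.requests), "7d")',  # same total, one week ago
]
url = render_url(targets)
print(url)
```

Opening the resulting URL in a browser should return a PNG; passing `fmt="json"` asks for the raw datapoints instead.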

Page 20: Graphite, an introduction

API (Nagios integration, 3rd party custom dashboards)

Page 21: Graphite, an introduction
Page 22: Graphite, an introduction

Scalable

Page 23: Graphite, an introduction

Easy to feed data

Page 24: Graphite, an introduction

Wide ecosystem of 3rd party tools and dashboards

http://graphite.readthedocs.org/en/latest/tools.html

Page 25: Graphite, an introduction

Tools

Page 26: Graphite, an introduction

StatsD

Page 27: Graphite, an introduction

Logster

Page 28: Graphite, an introduction

Skyline

Page 29: Graphite, an introduction

Collectd

Page 30: Graphite, an introduction

Dashboards

Page 31: Graphite, an introduction
Page 32: Graphite, an introduction
Page 33: Graphite, an introduction
Page 34: Graphite, an introduction
Page 35: Graphite, an introduction
Page 36: Graphite, an introduction

Graphite --

Page 37: Graphite, an introduction

No poller

Page 38: Graphite, an introduction

No all in one solution

Page 39: Graphite, an introduction

No easy backups

Page 40: Graphite, an introduction

It probably will become business critical

Page 41: Graphite, an introduction

How to graph

Page 42: Graphite, an introduction

There are tons of ways to feed graphite your data

Page 43: Graphite, an introduction

Bash

#!/bin/bash
# plaintext protocol: <metric path> <value> <timestamp>
timestamp=$(date +%s)
value=10
echo "dot.delimited.metric.name $value $timestamp" | nc -w 1 graphite.host.name 2003

Python

import socket

def send_msg(message, HOST, PORT):
    # message is one line such as "metric.name 10 1700000000\n"
    sock = socket.create_connection((HOST, PORT))
    sock.sendall(message.encode("ascii"))
    sock.close()

Python using graphite-pymetrics

from metrics import timing

@timing("heavy.task")
def heavy_task(x, y, z):
    # do heavy stuff here
    pass

Page 44: Graphite, an introduction

Ruby

require 'socket'

Host = 'somegraphitehost'
conn = TCPSocket.new Host, 2003
conn.puts 'Metrics value timestamp'
conn.close

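Java

import java.io.DataOutputStream;
import java.net.Socket;

public class GraphiteExample {
    public static void main(String[] args) throws Exception {
        Socket conn = new Socket("somegraphitehost", 2003);
        DataOutputStream dos = new DataOutputStream(conn.getOutputStream());
        // each metric line must end with a newline
        dos.writeBytes("metrics value timestamp\n");
        conn.close();
    }
}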

Page 45: Graphite, an introduction

How we use graphite

Page 46: Graphite, an introduction

700K+ metrics per minute

Page 47: Graphite, an introduction

A Common Graphite Stack

[Diagram components: Graphite-web, Collectd, Poller(s), Applications, Carbon, Whisper, Dashboards, Statsd, Scripts, Nagios]

Page 48: Graphite, an introduction

Collectd

Agent for system/hardware level metrics

Growing repository of plugins for a wide variety of applications: disk i/o, disk space, cpu, memory, mysql, JMX, java, Redis, file sizes, load, etc.

https://collectd.org/wiki/index.php/Table_of_Plugins

Write your custom plugin in python

Page 49: Graphite, an introduction

Nagios integration

You can write Nagios plugins that alert on metric values.

Nagios can also feed graphite performance data, events (e.g., update a counter each time an email is sent), etc.
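One way such a plugin could look is sketched below: it takes the JSON returned by the render API (`format=json`), picks the most recent non-null datapoint, and maps it onto the standard Nagios exit codes. The metric name, thresholds, and sample payload are invented for illustration; a real plugin would fetch the JSON over HTTP and finish with `sys.exit(status)`.

```python
import json

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def latest_value(render_json):
    """Return the newest non-null value of the first series, or None."""
    series = json.loads(render_json)
    if not series:
        return None
    # Each datapoint is a [value, timestamp] pair; value may be null.
    for value, _timestamp in reversed(series[0]["datapoints"]):
        if value is not None:
            return value
    return None


def check(value, warn, crit):
    """Map a metric value onto a Nagios status and message."""
    if value is None:
        return UNKNOWN, "UNKNOWN - no data"
    if value >= crit:
        return CRITICAL, f"CRITICAL - value is {value}"
    if value >= warn:
        return WARNING, f"WARNING - value is {value}"
    return OK, f"OK - value is {value}"


# Sample payload in the shape of a render API JSON response (made-up data).
sample = (
    '[{"target": "servers.web01.load", "datapoints": '
    "[[0.5, 1700000000], [4.2, 1700000060], [null, 1700000120]]}]"
)
status, message = check(latest_value(sample), warn=2.0, crit=5.0)
print(message)
```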

Page 50: Graphite, an introduction

What to collect?

Page 51: Graphite, an introduction

Hardware/OS metrics

Page 52: Graphite, an introduction

Load

Page 53: Graphite, an introduction

Disk space

Page 54: Graphite, an introduction

Disk I/O

Page 55: Graphite, an introduction

Network data

Page 56: Graphite, an introduction

Application metrics

Page 57: Graphite, an introduction

How often function x is called

Page 58: Graphite, an introduction

Average value of function x

Page 59: Graphite, an introduction

Average running time of function x
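A minimal sketch of how call counts and running times might be captured, assuming the metrics go through StatsD, which aggregates them and forwards the results to graphite. The decorator name `timed`, the helper `udp_sender`, and the metric name `app.function_x` are hypothetical; the payloads use the StatsD line format (`name:value|c` for counters, `name:value|ms` for timers).

```python
import functools
import socket
import time


def udp_sender(host="localhost", port=8125):
    """Return a function that sends one StatsD payload per UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(payload):
        sock.sendto(payload.encode("ascii"), (host, port))

    return send


def timed(name, send):
    """Decorator: emit a counter and a timer every time the function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                send(f"{name}.count:1|c")
                send(f"{name}.time:{elapsed_ms:.3f}|ms")
        return wrapper
    return decorator


# For demonstration, collect payloads in a list instead of sending them.
sent = []


@timed("app.function_x", sent.append)
def function_x(n):
    return sum(range(n))


result = function_x(1000)
print(sent)
```

In a real application, `udp_sender()` would replace `sent.append`, so each call costs one or two fire-and-forget UDP packets.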

Page 60: Graphite, an introduction

Database/Datastore

Page 61: Graphite, an introduction

performance metrics

Page 62: Graphite, an introduction

number of records with value == ?

Page 63: Graphite, an introduction

number of slow queries

Page 64: Graphite, an introduction

Events

Page 65: Graphite, an introduction

Deployments

Page 66: Graphite, an introduction

send a 1, draw as infinite
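The idea can be sketched as follows: at deploy time, send a single datapoint with value 1, then wrap that series in graphite's `drawAsInfinite()` function so it renders as a vertical line on any graph. The metric prefix `events.deploy` and the application name are assumptions for illustration.

```python
import time


def deploy_event_line(app, timestamp=None):
    """Plaintext-protocol line marking a deployment of `app`."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"events.deploy.{app} 1 {ts}\n"


def deploy_overlay_target(app):
    """Render target that draws each deploy as a vertical line."""
    return f"drawAsInfinite(events.deploy.{app})"


line = deploy_event_line("webapp", 1700000000)
target = deploy_overlay_target("webapp")
print(line, end="")
print(target)
```

The line would be sent to carbon the same way as any other metric, and the target added alongside the series being investigated.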

Page 67: Graphite, an introduction

Log files

Page 68: Graphite, an introduction

http access logs (2xx, 3xx, 4xx, 5xx)
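As a hedged sketch of what a log parser in the spirit of Logster might do, the code below counts responses per status class in access log lines and turns the counts into plaintext-protocol metric lines. The regular expression assumes the common/combined log format, and the `web.access` prefix and sample lines are made up.

```python
import re
from collections import Counter

# In common log format the status code follows the quoted request line.
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_status_classes(lines):
    """Count log lines per status class: 2xx, 3xx, 4xx, 5xx."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)[0] + "xx"] += 1
    return counts


def to_metric_lines(counts, timestamp, prefix="web.access"):
    """Format the counts as '<path> <value> <timestamp>' lines."""
    return [f"{prefix}.{cls} {n} {timestamp}"
            for cls, n in sorted(counts.items())]


sample = [
    '127.0.0.1 - - [10/Oct/2013:13:55:36 +0000] "GET / HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2013:13:55:37 +0000] "GET /missing HTTP/1.1" 404 512',
    '127.0.0.1 - - [10/Oct/2013:13:55:38 +0000] "POST /api HTTP/1.1" 500 0',
    '127.0.0.1 - - [10/Oct/2013:13:55:39 +0000] "GET /a HTTP/1.1" 200 10',
]
counts = count_status_classes(sample)
metric_lines = to_metric_lines(counts, 1700000000)
print("\n".join(metric_lines))
```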

Page 69: Graphite, an introduction

Application logs

Exception counts, results, important events, hits

Page 70: Graphite, an introduction

Final Musings

Page 71: Graphite, an introduction

Treat graphite like ‘Big Data’

Page 72: Graphite, an introduction

You don’t know what metrics you need until you need them

Page 73: Graphite, an introduction

Get RAID 10 SSDs once you decide to scale

Page 74: Graphite, an introduction

More devopsy

Page 75: Graphite, an introduction

You can start graphing today!