Metrics with Ganglia

description

A talk about using Ganglia and other tools to store all kinds of web application metrics, for both operations and business purposes. Presented at Cambridge Geek Night.

Transcript of Metrics with Ganglia

Page 1: Metrics with Ganglia

gareth rushgrove | morethanseven.net

Collecting Metrics With Ganglia and Friends

Cambridge Geek Night 28th March 2011

http://www.flickr.com/photos/memestate/45986749

Page 2: Metrics with Ganglia

Gareth Rushgrove

Page 3: Metrics with Ganglia

Work at FreeAgent

freeagentcentral.com

Page 4: Metrics with Ganglia

Blog at morethanseven.net

Page 5: Metrics with Ganglia

Curate devopsweekly.com

Page 6: Metrics with Ganglia

Covering (Business Version)

- Capacity planning metrics

- Metrics for your application

- Business analytics

- Having everything in one place

Page 7: Metrics with Ganglia

Covering (Tech Version)

- Ganglia: Store metrics and view graphs

- Logster: Get log files into Ganglia

- Gmetric: Get anything into Ganglia

- Syslog: Using Loggly to view individual log items

Page 8: Metrics with Ganglia

Everyone Uses Something Like?

Page 9: Metrics with Ganglia

Use Something Like This Too

Page 10: Metrics with Ganglia

What is Ganglia?

"Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids."

ganglia.sourceforge.net

Page 11: Metrics with Ganglia

Example: vagrantbox.es

Page 12: Metrics with Ganglia

Load Averages

Page 13: Metrics with Ganglia

CPU

Page 14: Metrics with Ganglia

Aggregate Graphs

Page 15: Metrics with Ganglia

Across Entire Cluster

Page 16: Metrics with Ganglia

Predicting When Your System Will Fail

"A strategy for anticipating future workloads of your computers, with the aim of creating a computing environment that can handle future workload."

IBM

Page 17: Metrics with Ganglia

Disk Space
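
One way to turn a disk space graph into a prediction is to fit a straight line through recent usage samples and see where it crosses the disk's capacity. The sketch below is an illustration only, not something shown in the talk: the function name days_until_full, the (day, used_gb) sample format and the sample numbers are all assumptions, and real usage is rarely this linear.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days until a disk fills, from (day, used_gb) samples.

    Fits a least-squares straight line through the samples and extends it
    to capacity_gb. Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    sum_x = sum(x for x, _ in samples)
    sum_y = sum(y for _, y in samples)
    sum_xy = sum(x * y for x, y in samples)
    sum_xx = sum(x * x for x, _ in samples)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("samples must cover more than one point in time")

    slope = (n * sum_xy - sum_x * sum_y) / denom   # GB per day
    intercept = (sum_y - slope * sum_x) / n        # GB at day 0
    if slope <= 0:
        return None

    full_day = (capacity_gb - intercept) / slope
    last_day = max(x for x, _ in samples)
    # Never report a negative number of days if the disk is already full
    return max(0.0, full_day - last_day)


if __name__ == "__main__":
    # Hypothetical daily samples: (day number, GB used)
    usage = [(0, 10.0), (1, 12.0), (2, 14.0)]
    print(days_until_full(usage, capacity_gb=20.0))
```

The samples could come from whatever disk metric is already being graphed; the point is only that a stored history makes this kind of extrapolation possible.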

Page 18: Metrics with Ganglia

Monitoring Your Application

Page 19: Metrics with Ganglia

86.26.7.33 - - [26/Mar/2011:20:39:52 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.1" 200 2081 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"

Web Server Logs

Page 20: Metrics with Ganglia

Logster from Etsy

Page 21: Metrics with Ganglia

Tail a log file and filter each line to generate metrics that can be sent to common monitoring packages.

Options:
  -p METRIC_PREFIX, --metric-prefix=METRIC_PREFIX
        Add prefix to all published metrics. This is for people that may multiple instances of same service on same host.
  --gmetric-options=GMETRIC_OPTIONS
        Options to pass to gmetric such as -d 180 -c /etc/ganglia/gmond.conf (default). These are passed directly to gmetric.
  --graphite-host=GRAPHITE_HOST
        Hostname and port for Graphite collector, e.g. graphite.example.com:2003
  -s STATE_DIR, --state-dir=STATE_DIR
        Where to store the logtail state file. Default location /var/run
  -d, --dry-run
        Parse the log file but send stats to standard output.
  -D, --debug
        Provide more verbose logging for debugging.

Logster
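
To make "filter each line to generate metrics" concrete, here is a minimal standalone sketch of the idea: match each access log line against a pattern and count responses by status class. It deliberately does not use Logster's own parser classes; the function name count_status_classes and the metric names such as http_2xx are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches the request and status fields of an access log line, e.g.
#   "GET / HTTP/1.0" 200 5970
STATUS_RE = re.compile(r'"[A-Z]+ \S+ HTTP/\d\.\d" (?P<status>\d{3}) ')


def count_status_classes(lines):
    """Count responses per status class (2xx, 3xx, ...) in access log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        # The first digit of the status code identifies its class
        counts["http_%sxx" % match.group("status")[0]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '86.26.7.33 - - [26/Mar/2011:20:39:52 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"',
        '86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"',
    ]
    for name, value in sorted(count_status_classes(sample).items()):
        print(name, value)
```

A real log tailer also has to remember how far into the file it read last time, which is what the state file in the options above is for; this sketch leaves that part out.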

Page 22: Metrics with Ganglia

logster SampleGangliaLogster /../access.log

Logster Command Line

Page 23: Metrics with Ganglia

HTTP Responses with a 2xx Status Code

Page 24: Metrics with Ganglia

The Ganglia Metric Client (gmetric) announces a metric on the list of defined send channels defined in a configuration file

Usage: gmetric [OPTIONS]...

  -V, --version          Print version and exit
  -c, --conf=STRING      The configuration file to use for finding send channels (default='/etc/ganglia/gmond.conf')
  -n, --name=STRING      Name of the metric
  -v, --value=STRING     Value of the metric
  -t, --type=STRING      Either string|int8|uint8|int16|uint16|int32|uint32|float|double
  -u, --units=STRING     Unit of measure for the value e.g. Kilobytes, Celcius (default='')
  -s, --slope=STRING     Either zero|positive|negative|both (default='both')
  -x, --tmax=INT         The maximum time in seconds between gmetric calls (default='60')
  -d, --dmax=INT         The lifetime in seconds of this metric (default='0')
  -S, --spoof=STRING     IP address and name of host/device (colon separated) we are spoofing (default='')
  -H, --heartbeat        spoof a heartbeat message (use with spoof option)

Gmetric

Page 25: Metrics with Ganglia

Gmetric Scripts for Common Applications

Page 26: Metrics with Ganglia

gmetric -n sales -v 200 -t float

Gmetric Command Line
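
The same command can be issued from application code. The sketch below builds the argument list for the command line above using only flags from the gmetric usage text (-n, -v, -t, -u); the helper names gmetric_args and send_metric are assumptions, and send_metric only works on a host where gmetric is installed and configured.

```python
import subprocess


def gmetric_args(name, value, metric_type="float", units=None):
    """Build the argument list for a gmetric call."""
    args = ["gmetric", "-n", str(name), "-v", str(value), "-t", metric_type]
    if units:
        args += ["-u", units]
    return args


def send_metric(name, value, **kwargs):
    """Run gmetric; raises CalledProcessError if it exits with an error."""
    # A list of arguments avoids going through a shell
    subprocess.check_call(gmetric_args(name, value, **kwargs))


if __name__ == "__main__":
    # Print the command rather than run it, so this works without Ganglia
    print(" ".join(gmetric_args("sales", 200)))
```

Calling send_metric("sales", 200) from a cron job or a signup handler would be one way to feed a business metric into the same graphs as the system metrics.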

Page 27: Metrics with Ganglia

Our Custom Metric in Ganglia

Page 28: Metrics with Ganglia

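import subprocess

from bottle import route, run, abort, default_app

# GET /<name>/<value> records <value> for the metric <name>
@route('/:name/:value')
def index(name, value):
    # Pass the arguments as a list (no shell) so URL parts cannot inject commands
    cmd = ['gmetric', '-n', name, '-v', value, '-t', 'float']
    try:
        subprocess.check_call(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        return "Success: %s" % ' '.join(cmd)
    except (subprocess.CalledProcessError, OSError):
        # gmetric failed, or is not installed on this host
        abort(500, "Error")

# WSGI application object for a web server to load
app = default_app()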

Gmetric HTTP Interface

Page 29: Metrics with Ganglia

http://../sales/200

Gmetric URL
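
A client for this URL scheme only needs to build the path and make a GET request. The sketch below is one possible client; the helper names and the base URL http://metrics.example.com are placeholders, since the real host is not given on the slide.

```python
from urllib.parse import quote
from urllib.request import urlopen


def metric_url(base_url, name, value):
    """Build a /<name>/<value> URL under base_url."""
    return "%s/%s/%s" % (
        base_url.rstrip("/"),
        quote(str(name), safe=""),   # escape characters that are not URL-safe
        quote(str(value), safe=""),
    )


def report_metric(base_url, name, value, timeout=5):
    """Send the metric with a GET request and return the response body."""
    with urlopen(metric_url(base_url, name, value), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder host; prints the URL without contacting anything
    print(metric_url("http://metrics.example.com", "sales", 200))
```

Because the interface is plain HTTP, anything that can fetch a URL can record a metric, including code that does not run on a host with gmetric installed.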

Page 30: Metrics with Ganglia

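import subprocess
import SocketServer  # Python 2 module name; renamed socketserver in Python 3

class GmetricTCPHandler(SocketServer.BaseRequestHandler):

    def handle(self):
        # Expect a single message of the form "<name> <value>"
        self.data = self.request.recv(1024).strip()
        items = self.data.split(' ')
        try:
            # Pass the arguments as a list (no shell) so input cannot inject commands
            cmd = ['gmetric', '-n', items[0], '-v', items[1], '-t', 'float']
            subprocess.check_call(
                cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            # The return value of handle() is discarded, so nothing is sent back to the client
            return "Success: %s" % ' '.join(cmd)
        except Exception:
            return "Error"

if __name__ == "__main__":
    HOST, PORT = "0.0.0.0", 8001
    server = SocketServer.TCPServer((HOST, PORT), GmetricTCPHandler)
    server.serve_forever()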

Gmetric TCP Interface

Page 31: Metrics with Ganglia

sales 200

Gmetric TCP
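
Sending that message from code takes only a socket connection. The sketch below is a possible client for the TCP interface: it writes "name value" to the server and closes the connection. The helper names are assumptions, and port 8001 is taken from the server code on the previous slide.

```python
import socket


def format_metric(name, value):
    """Encode a metric as the "name value" message the server splits on a space."""
    return ("%s %s\n" % (name, value)).encode("ascii")


def send_metric(host, port, name, value, timeout=5):
    """Open a TCP connection, send one metric, and close the connection."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(format_metric(name, value))


if __name__ == "__main__":
    # Shows the bytes that would be sent; no server is contacted
    print(format_metric("sales", 200))
```

With the server running locally, send_metric("127.0.0.1", 8001, "sales", 200) would record the value. Metric names must not contain spaces, because the server splits the message on a single space.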

Page 32: Metrics with Ganglia

Syslog

"Syslog is a standard for logging program messages. It allows separation of the software that generates messages from the system that stores them and the software that reports and analyzes them."

Wikipedia
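
As an illustration of that separation, an application can hand its messages to syslog and leave storage and analysis to whatever is listening. The sketch below uses SysLogHandler from Python's standard logging package; the function name make_syslog_logger, the logger name "webapp", the message format and the default address of localhost port 514 are all assumptions for the example.

```python
import logging
import logging.handlers


def make_syslog_logger(name, address=("localhost", 514)):
    """Return a logger that sends its messages to a syslog daemon over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=address)
    # Prefix each message with the logger name so it can be filtered later
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_syslog_logger("webapp")
    # UDP is fire-and-forget, so this is harmless if no daemon is listening
    log.info("signup user=%s", 42)
```

The application never needs to know where the messages end up; a local syslog daemon could write them to disk or forward them to a hosted service.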

Page 33: Metrics with Ganglia

Loggly - Logging as a Service

Page 34: Metrics with Ganglia

View logs

Page 35: Metrics with Ganglia

Logstash

Page 36: Metrics with Ganglia

Graylog2

Page 37: Metrics with Ganglia

Other Things You Could Monitor

- Database table sizes

- Cache hits

- Time taken for test runs

- Codebase size

- Signups, sales, subscriptions

- Twitter followers
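
Any of these can become a metric as long as a script can produce a number. Taking codebase size as an example, the sketch below counts lines in source files under a directory; the function name count_lines, the choice of .py files and the metric name codebase_lines mentioned afterwards are assumptions for illustration.

```python
import os


def count_lines(root, extensions=(".py",)):
    """Count lines in all files under root whose names end in extensions."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(tuple(extensions)):
                path = os.path.join(dirpath, filename)
                # Read as bytes so unusual encodings cannot cause errors
                with open(path, "rb") as handle:
                    total += sum(1 for _ in handle)
    return total
```

The result could then be published with a command such as gmetric -n codebase_lines -v <count> -t uint32, run on a schedule so the graph shows growth over time.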

Page 38: Metrics with Ganglia

What Next?

- Wikipedia: http://ganglia.wikimedia.org/

- Install Ganglia: deb and rpm packages available

- Add system metrics: web servers, databases

- Add business metrics: users, sales, tweets

- Try Loggly or at least investigate syslog

Page 39: Metrics with Ganglia

Reading

Page 40: Metrics with Ganglia

CBGN11

2 months free on FreeAgent

Page 41: Metrics with Ganglia

Questions?

http://flickr.com/photos/psd/102332391/