OSDC 2014: Devdas Bhagat - Graphite: Graphs for the modern age
Graphite: Graphs for the modern age
Graphite basics
● Graphite generates graphs from timeseries data
– Think MRTG or Cacti
– More flexible than those
● Written in Python
– This does impact performance
● Web based and easy to use
– For once, not a marketing buzzword
The church of Graphs
● Pattern Recognition
● Correlation
● Analytics
● Anomaly detection
Helpful Graphite features
● Out of order data insertion
● Ability to compare corresponding time periods (time travel; example below)
● Custom retention periods
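Time travel is a render-time operation. A minimal example, assuming a hypothetical metric name: ask for the same series twice in one request, once shifted back a week with Graphite's timeShift() function, and the two lines can be compared on a single graph.

/render?target=sys.web01.requests&target=timeShift(sys.web01.requests,"7d")&from=-4h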
Moving parts
● Relays
– Send data to the correct backend store
– Pattern matching on metric names
– Consistent hashing
● Storage
– Flat, fixed size files (see the schema example below)
– These are created when the metric is first recorded
– Changing them later is hard
● Webapp
– Django based application offering a web API and a Javascript based frontend application
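The size and resolution of each file are fixed at creation time by carbon's storage-schemas.conf. A minimal illustrative example (the patterns and retention values here are made up, not our production settings):

[sys]
pattern = ^sys\.
retentions = 10s:6h,1m:7d,10m:5y

[default]
pattern = .*
retentions = 1m:7d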
Data output
● Web API (example below)
– Everything is an HTTP GET
– A number of functions for data manipulation
● Graphite offers output in multiple formats
– Graphical (PNG, SVG)
– Structured (JSON, CSV)
– Raw data
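A minimal sketch of pulling JSON out of the render API (the webapp host and metric name are made up; the same URL with format=png or format=csv returns the other formats):

import json
import urllib.request

# Hypothetical Graphite webapp and metric name, for illustration only
url = ("http://graphite.example.com/render"
       "?target=sys.web01.cpu.user&from=-1h&format=json")

with urllib.request.urlopen(url) as resp:
    series = json.load(resp)

for s in series:
    # each series is {"target": name, "datapoints": [[value, timestamp], ...]}
    values = [v for v, ts in s["datapoints"] if v is not None]
    print(s["target"], "latest:", values[-1] if values else "no data")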
Using Graphite
● Custom pages pulling in PNG images
– Just <img src="some url here">
● Using the default frontend
– For single, one off graphs
– Debugging problems
● Using builtin dashboards
– Users create their own dashboards
– Third party dashboard tools
● Using third party libraries
– JSON is nice for this
– Cubism, D3.js, Rickshaw, etc.
Using Graphite
● API
– Monitoring (sketch below)
– Runtime performance tuning
● Postmortem analytics
● Performance debugging
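For monitoring, the same render API can back a threshold check. A rough Nagios-style sketch, assuming a hypothetical host, metric and threshold:

import json
import sys
import urllib.request

URL = ("http://graphite.example.com/render"
       "?target=sys.web01.loadavg.1min&from=-5min&format=json")
THRESHOLD = 10.0

series = json.load(urllib.request.urlopen(URL))
points = [v for v, ts in series[0]["datapoints"] if v is not None] if series else []

if not points:
    print("UNKNOWN: no data returned")
    sys.exit(3)
if points[-1] > THRESHOLD:
    print("CRITICAL: load is %.1f" % points[-1])
    sys.exit(2)
print("OK: load is %.1f" % points[-1])
sys.exit(0)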
Making Graphite scale
● Original setup
– Small cluster: two frontend boxes, two backend
– RAID 1+0 with 4 spinning disks
– This works well with about 200 machines
● All those individual files force a lot of seeks
Scaling out - try 1
● Add more backend boxes
– Manual rules to split traffic (example below)
– Pattern matching based on metric names
● Balancing traffic is hard
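The manual split rules live in carbon's relay-rules.conf. A small illustrative example (patterns and destinations are made up):

[system]
pattern = ^sys\.
destinations = store-01:2004, store-02:2004

[default]
default = true
destinations = store-03:2004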
Scaling up
● Replace spinning disks with SSDs
● Massive performance improvement due to more IOPS
– Still not as much as we needed
● Losing an SSD meant we had a box die
– This has been fixed
● SSDs are not as reliable as spinning rust
– SSDs last between 12 and 14 months
Sharding – take II
● At about 10 storage servers, manually maintaining regular expressions became painful
● Keeping disk usage balanced was even harder
– Anyone is allowed to create graphs
Sharding – take II
● Replace regular expressions with consistent hashing (sketch below)
● Switch to RAID 0
– We have switched back to RAID 1
● Store data on two nodes in each ring
● Mirror rings in datacenters
● Shuffle metrics to avoid losing data and disk space
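A minimal sketch of the consistent hashing idea (illustrative only, not carbon's actual implementation; node and metric names are made up): every node owns many points on a hash ring, a metric hashes to a point, and the next two distinct nodes clockwise store it.

import bisect
import hashlib

class HashRing:
    # Each node is placed at many positions on the ring; a metric is
    # stored on the first `replicas` distinct nodes found clockwise
    # from its own hash position.
    def __init__(self, nodes, points_per_node=100):
        self.ring = []  # sorted list of (position, node)
        for node in nodes:
            for i in range(points_per_node):
                self.ring.append((self._hash("%s:%d" % (node, i)), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_nodes(self, metric, replicas=2):
        idx = bisect.bisect(self.ring, (self._hash(metric), ""))
        chosen = []
        while len(chosen) < replicas:
            node = self.ring[idx % len(self.ring)][1]
            if node not in chosen:
                chosen.append(node)
            idx += 1
        return chosen

ring = HashRing(["store-01", "store-02", "store-03", "store-04"])
print(ring.get_nodes("sys.web01.cpu.user"))  # -> two storage nodes

Adding a node only moves the metrics that hash near its points, which is why rebalancing beats maintaining regular expressions by hand.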
Disk usage
● Graphite uses a lot of disk IO
– The background graph is in thousands on the Y axis
– Individual files increase seek times
● There are a lot of stat(2) calls
– This hasn't been investigated yet
Naming conventions
● Graphite has no rules for names
● We adopted:
– sys.* is for system metrics
– user.* is for testing/other stuff
– Anything else which makes sense is acceptable
Collecting metrics
● We have all sorts of homegrown scripts (sketch below)
– Shell
– Perl
– Python
– Powershell
● Originally used collectd for system metrics
– The version of collectd we were using had memory usage issues
– These have since been fixed
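A minimal homegrown-script sketch: push one value to carbon's plaintext protocol (port 2003 is the default listener; the relay hostname and metric name are made up):

#!/usr/bin/env python
import os
import socket
import time

CARBON_HOST = "carbon-relay.example.com"  # hypothetical relay host
CARBON_PORT = 2003                        # carbon plaintext listener default

def send_metric(path, value, timestamp=None):
    # plaintext protocol: one "path value timestamp" line per metric
    line = "%s %s %d\n" % (path, value, timestamp or int(time.time()))
    sock = socket.create_connection((CARBON_HOST, CARBON_PORT), timeout=5)
    try:
        sock.sendall(line.encode())
    finally:
        sock.close()

if __name__ == "__main__":
    # e.g. the 1-minute load average under the sys.* namespace from this talk
    send_metric("sys.%s.loadavg.1min" % socket.gethostname(), os.getloadavg()[0])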
Collecting metrics
● System metrics are now collected by Diamond
● Diamond is a Python application
– Base framework + metric collection scripts (sketch below)
– Added custom patches for internal metrics
– Added patches to send monitoring data directly to Nagios for passive checks
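The collection scripts are small classes plugged into the framework. A bare-bones sketch based on Diamond's documented collector API (the metric itself is made up):

import diamond.collector

class ExampleCollector(diamond.collector.Collector):
    def collect(self):
        # read a value from somewhere and publish it; Diamond adds the
        # configured path prefix and hostname to the metric name
        self.publish('example.answer', 42)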
Relay issues
● The Python relaying implementation eats CPU
● Started with relays directly on the cluster
– Still need more CPU
● Added relays in each datacenter
– Still need more CPU
● Ran multiple instances on each relay host
– Still need more CPU
● Finally rewrote in C and added more relay hosts
– This works for us (and we have breathing room)
Data visibility
● We send data to multiple places
– Metrics get dropped
● Small application in Go which gets data from multiple locations and gives us a single merged resultset (sketch below)
– Prototyped in Python, which was too slow
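Roughly what the merge does, sketched in Python for brevity (hostnames and metric are made up; the real tool is the Go "zipper" linked at the end): query each location for the same target and keep whichever side has a value at each timestamp.

import json
import urllib.request

def fetch(host, target):
    url = "http://%s/render?target=%s&from=-1h&format=json" % (host, target)
    return json.load(urllib.request.urlopen(url))[0]["datapoints"]

def merge(a, b):
    # datapoints are [value, timestamp] pairs on the same retention step
    by_ts = {ts: v for v, ts in a}
    for v, ts in b:
        if by_ts.get(ts) is None:
            by_ts[ts] = v
    return [[v, ts] for ts, v in sorted(by_ts.items())]

merged = merge(fetch("graphite-dc1.example.com", "sys.web01.cpu.user"),
               fetch("graphite-dc2.example.com", "sys.web01.cpu.user"))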
statsd
● We had statsd running, but unused for a long time
– statsd use is still relatively small
– Only a few internal applications use it (example below)
– We already have an analytics framework for this
● The PCI vulnerability scanner reliably crashed it
– This was patched and pushed upstream
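For reference, an application talks to statsd by firing fire-and-forget UDP packets in the "name:value|type" format at port 8125 (the default); the hostname and counter name here are made up:

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b"myapp.logins:1|c", ("statsd.example.com", 8125))  # count one login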
Business metrics
● Turns out, developers like Graphite
– They don't reliably understand whisper semantics
– Querying Graphite like SQL doesn't work
● They create a large number of named metrics
– foo.bar.YYYY-MM-DD
● Disk space use is a sudden concern
– Especially when you don't try and restrict this (feature, not bug)
Scaling out clusters
● Different groups have different requirements
● Multiple backend rings, same frontend
– Unix systems
– Windows
– Networking
– Business metrics
– User testing
Current problems
● Hardware
– Need more CPU
● Especially on the frontends where we do a lot of maths
– Better disk reliability on SSDs
● Replacing disks is expensive
– More disk IO
● SSDs are now maxed out under stat(2) calls
● Testing Fusion IO cards
– 10% faster, but we don't know about reliability yet
Current problems
● People
– If you need a graph, put the data in Graphite
● Even if the data isn't time series data
● Frontend scalability
– The default frontend doesn't work well with a few thousand hosts
● Software upgrades
– Our last Whisper upgrade caused data recording to stop
Current problems
● Manageability
– Getting rid of older, non-required metrics is a lot of effort
– Adding hosts into a ring requires manual rebalancing effort
Future possibilities
● Testing Cassandra as a backend (Cyanite)
● Anomaly detection
– Tested Skyline, didn't scale
● More business metrics
● Sparse metrics
– Metrics with a lot of nulls, but potentially a lot of named metrics involved
Peopleware
● Hiring people to work on interesting challenges
– Sysadmins, developers
– http://www.booking.com/jobs
● Booking.com will be sponsoring a Graphite dev summit in June (tentatively just before the devopsdays Amsterdam event)
Reference URLs
● Graphite
– https://github.com/graphite-project
● Graphite API
– http://graphite.readthedocs.org/en/latest/functions.html
● C Carbon relay
– https://github.com/grobian/carbon-c-relay
● Zipper
– https://github.com/grobian/carbonserver
● Cyanite
– https://github.com/pyr/cyanite
– https://github.com/brutasse/graphite-cyanite
?