Monitoring Cassandra with Riemann

Post on 27-Aug-2014

Monitor your Cassandra cluster using Riemann

Transcript of Monitoring Cassandra with Riemann

Real-time Monitoring with Riemann

Patricia Gorla (@patriciagorla), Cassandra Consultant, www.thelastpickle.com

OSCON

About Us

• Work with clients to deliver and improve Apache Cassandra services

• Apache Cassandra committer, DataStax MVP, Hector maintainer, Apache Usergrid committer

• Based in New Zealand & USA

Overview

Setting up Riemann

Simple Monitoring

Aggregate Monitoring

Monitoring a cluster

Database: Compactions Pending, Data Size, Read/Write Requests, Read/Write Latency, CF SSTable Size

Networking: Read/Write Request Timeouts, Dropped Messages, Hints Stored

JVM: CMS Collection Time, ParNew Collection Time

org.apache.cassandra.metrics

• Codahale metrics library first introduced in CASSANDRA-3671 (v 1.1)

• Pluggable reporters introduced in CASSANDRA-4430 (v 2.0.2)

• Complete description in Apache documentation

• Dropped Messages

• Cache

• Commit Log

• Compaction

• Connection

• Client Request

• Storage

Riemann

• Real-time

• Streams are successions of filters

• Expressive programming language

copyright Kyle Kingsbury, used with permission

Monitoring Ecosystem

Download package from riemann.io

/etc/riemann/riemann.config
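As a rough illustration of what this file might contain, here is a minimal sketch modelled on the stock configuration that ships with the Riemann package. The bind address and the TTL and expiry intervals are assumptions, not values from the talk.

```clojure
; Minimal sketch of /etc/riemann/riemann.config (assumed values throughout).

; Listen on all interfaces so that the Cassandra nodes can reach Riemann.
; The default ports are 5555 (TCP and UDP) and 5556 (websockets, used by riemann-dash).
(let [host "0.0.0.0"]
  (tcp-server {:host host})
  (udp-server {:host host})
  (ws-server  {:host host}))

; Sweep the index for expired events every 5 seconds.
(periodically-expire 5)

(let [index (index)]
  (streams
    ; Give events without a TTL a 60-second lifetime, then index them
    ; so that riemann-dash can query them.
    (default :ttl 60
      index)))
```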

Modified version of https://github.com/addthis/metrics-reporter-config

Specify the metrics to send from Cassandra

cassandra-env.sh

Starting Cassandra

INFO [2014-07-15 21:44:50,712] pool-1-thread-1 - riemann.config - average event #riemann.codec.Event{ :host x.x.x.x, :service org.apache.cassandra.metrics ClientRequest Write Latency .95, :state "ok", :description nil, :metric 19514.037499999988, :tags nil, :time 1405482287, :ttl nil }

The :service value is made up of the package name, class, and module.

/etc/riemann/riemann.config
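A stream along the following lines would index incoming Cassandra metrics and write each one to the Riemann log. This is a sketch only: the regular expression assumes that every service name starts with the org.apache.cassandra.metrics package, and the log message text is made up.

```clojure
; Sketch: index and log every event whose service starts with the
; Cassandra metrics package name (assumed prefix).
(let [index (index)]
  (streams
    (where (service #"^org\.apache\.cassandra\.metrics")
      ; Keep the latest event per host and service for the dashboard.
      index
      ; Write the full event to the Riemann log at INFO level.
      #(info "cassandra event" %))))
```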

• Flexible

• Simple to set up

• Keyboard shortcuts

riemann-dash

Edit graphs with keyboard shortcuts

Mean 95th Percentile Latency across all nodes (avg)

Can also use fixed-event-window for less regular occurrences

Chart in riemann-dash
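One way to compute a cluster-wide mean of the 95th percentile write latency is sketched below. The service string is copied from the log output shown earlier. The 10-second window, the choice of fixed-time-window, and the name of the aggregated service are assumptions and may differ from the configuration used in the talk.

```clojure
; Sketch for /etc/riemann/riemann.config (window size and output name are assumed).
(let [index (index)]
  (streams
    (where (service "org.apache.cassandra.metrics ClientRequest Write Latency .95")
      ; Gather the events from every node that fall in the same 10-second window.
      ; For metrics that arrive irregularly, (fixed-event-window n ...) groups
      ; by event count instead of by time.
      (fixed-time-window 10
        ; Fold each window (a vector of events) into one event holding the mean metric.
        (smap folds/mean
          ; Drop the per-node host and index the result under a cluster-wide name.
          (with {:host nil :service "cassandra write latency .95 mean"}
            index))))))
```

The indexed result can then be charted in riemann-dash with a query on the aggregated service name.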

Detecting threshold breaches
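A threshold check can be expressed as a where filter on the metric value. In this sketch the threshold of 20000 is an arbitrary assumption (the metric in the log output above is about 19514), and the stream only logs a warning; a real configuration would more likely notify someone.

```clojure
; Sketch: flag write latency events above an assumed threshold.
(streams
  (where (and (service "org.apache.cassandra.metrics ClientRequest Write Latency .95")
              ; Guard against events that carry no metric at all.
              metric
              (> metric 20000))
    ; Mark the event as critical and log it.
    (with :state "critical"
      #(warn "write latency above threshold" %))))
```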

Generalise which metrics to monitor

Call the custom function in the (streams …) section
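The two steps above could look roughly like this. The helper name above-threshold, its arguments, the thresholds, and the read latency service name are all hypothetical; only the write latency service name appears in the talk's log output.

```clojure
; Hypothetical helper: passes events whose service matches `pattern` and whose
; metric exceeds `threshold` on to `child`, marked as critical.
(defn above-threshold
  [pattern threshold child]
  (where* (fn [event]
            (let [svc (:service event)
                  m   (:metric event)]
              ; Ignore events with a missing service or metric.
              (and svc
                   m
                   (re-find pattern svc)
                   (> m threshold))))
    (with :state "critical" child)))

; Call the helper once per metric inside the (streams ...) section.
(streams
  (above-threshold #"ClientRequest Write Latency \.95$" 20000
    #(warn "write latency above threshold" %))
  ; Assumed service name, by analogy with the write latency metric.
  (above-threshold #"ClientRequest Read Latency \.95$" 20000
    #(warn "read latency above threshold" %)))
```

Using where* with a plain function keeps the pattern and threshold as ordinary runtime arguments, which avoids relying on how the where macro rewrites its expression.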

Q&A

Patricia Gorla (@patriciagorla), Cassandra Consultant, www.thelastpickle.com