Surfing the event stream

Transcript
Page 1: Surfing the event stream

@samnewman #geecon

Surfing The Event Stream

Sam Newman

ThoughtWorks

Sunday, 21 July 13

Page 2: Surfing the event stream

Page 3: Surfing the event stream

Page 4: Surfing the event stream

Operational Data

Page 5: Surfing the event stream

Operational Data

CPU

Page 6: Surfing the event stream

Operational Data

CPU

Memory Use

Page 7: Surfing the event stream

Operational Data

CPU

Memory Use

Threads

Page 8: Surfing the event stream

Operational Data

CPU

Disk IO

Memory Use

Threads

Page 9: Surfing the event stream

Collection & Display

• sar
• syslog
• collectd
• syslog-ng
• nagios
• ganglia

Page 10: Surfing the event stream

Server

Server

Server

Server

Page 11: Surfing the event stream

Server

Server

Server

Server

Page 12: Surfing the event stream

Server

Server

Server

Server

Page 13: Surfing the event stream

Server

Server

Server

Server

Page 14: Surfing the event stream

Business Data

Page 15: Surfing the event stream

Business Data

Orders Placed

Page 16: Surfing the event stream

Business Data

Orders Placed

Revenue

Page 17: Surfing the event stream

Business Data

Orders Placed

Revenue

Fraud Cases

Page 18: Surfing the event stream

Business Data

Orders Placed

Bounce Rate

Revenue

Fraud Cases

Page 19: Surfing the event stream

How did we handle them?

• Google Analytics
• Data Warehouse Systems
• Log files!

Page 20: Surfing the event stream

Something Happened!

Page 21: Surfing the event stream

Something Happened!

What Should We Do?

Page 22: Surfing the event stream

Something Happened!

What Should We Do?

Page 23: Surfing the event stream

Something Happened!

What Should We Do?

Page 24: Surfing the event stream

Page 25: Surfing the event stream

http://blog.jgc.org/2006/05/what-slashdot-effect-looks-like.html

Page 26: Surfing the event stream

Page 27: Surfing the event stream

Fast

Page 28: Surfing the event stream

Fast

And Easy...

Page 29: Surfing the event stream

Fast

And Easy...

At Scale

Page 30: Surfing the event stream

Aggregation Is Key

Page 31: Surfing the event stream

Mark McGranaghan: "Logs as Data"

http://blip.tv/clojure/mark-mcgranaghan-logs-as-data-5953857

Page 32: Surfing the event stream

Paul Ingles: "Users as Data"

http://vimeo.com/45136211

Page 33: Surfing the event stream

Logstash + Graylog2

Page 34: Surfing the event stream

Logstash + Graylog2

Page 35: Surfing the event stream

Logstash + Graylog2

Page 36: Surfing the event stream

Logstash + Graylog2

Page 37: Surfing the event stream

Page 38: Surfing the event stream

Graphite

Page 39: Surfing the event stream

Page 40: Surfing the event stream

Page 41: Surfing the event stream

www01.cpuUsage 42 1286269200
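The line on this slide has three space-separated fields: a dotted metric path, a value, and a Unix timestamp in seconds. A minimal sketch of building and sending such a line over Graphite's plaintext interface follows. The host name is a placeholder, port 2003 is assumed to be the Carbon plaintext listener, and the helper names `graphite_line` and `send_to_graphite` are made up for illustration.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one sample as '<metric path> <value> <unix timestamp>' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())  # default to "now", in whole seconds
    return f"{path} {value} {timestamp}\n"


def send_to_graphite(line, host="localhost", port=2003):
    """Send one line over TCP. Host and port are assumptions for illustration."""
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))


# Rebuild the exact line shown on the slide.
sample = graphite_line("www01.cpuUsage", 42, 1286269200)
```

`send_to_graphite(sample)` is deliberately not called here, since it needs a reachable Carbon listener.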

Page 42: Surfing the event stream

Page 43: Surfing the event stream

Page 44: Surfing the event stream

Page 45: Surfing the event stream

Page 46: Surfing the event stream

???

Page 47: Surfing the event stream

Page 48: Surfing the event stream

Page 49: Surfing the event stream

Graphite

Page 50: Surfing the event stream

Graphite

Server

collectd

Page 51: Surfing the event stream

Graphite

App

Server

collectd

Page 52: Surfing the event stream

Graphite

App

Server

Server

collectd

Page 53: Surfing the event stream

Graphite

App

Server

Server

collectd

Yammer Metrics

Page 54: Surfing the event stream

Graphite

App

Server

Server

collectd

Yammer Metrics

Page 55: Surfing the event stream

Volume!

Page 56: Surfing the event stream

Aggregation!

Page 57: Surfing the event stream

www01.cpuUsage 42 1286269200

Page 58: Surfing the event stream

orderplaced 1 1286269200

Page 59: Surfing the event stream

orderplaced 1 1286269200

orderplaced 1 1286269200

Page 60: Surfing the event stream

orderplaced 1 1286269200

orderplaced 1 1286269200

orderplaced = 1
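The slide's point is that two samples with the same metric name and timestamp end up as a single value of 1, not 2. The sketch below is a toy in-memory model of that behaviour, not Graphite's storage engine: one function overwrites on a repeated key, the other sums per key before storing. Both function names are hypothetical.

```python
from collections import defaultdict

# The two identical samples from the slide: (metric, value, unix timestamp).
samples = [
    ("orderplaced", 1, 1286269200),
    ("orderplaced", 1, 1286269200),
]


def store_last_write_wins(samples):
    """Model a store that keeps one value per (metric, timestamp) slot."""
    store = {}
    for name, value, timestamp in samples:
        store[(name, timestamp)] = value  # a second write replaces the first
    return store


def aggregate_then_store(samples):
    """Sum all values that share a (metric, timestamp) slot before storing."""
    totals = defaultdict(int)
    for name, value, timestamp in samples:
        totals[(name, timestamp)] += value
    return dict(totals)


overwritten = store_last_write_wins(samples)
aggregated = aggregate_then_store(samples)
```

Under these assumptions `overwritten` holds 1 for the slot and `aggregated` holds 2, which is why an aggregating step in front of the store matters for event counts.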

Page 61: Surfing the event stream

StatsD

Page 62: Surfing the event stream

Counters

ordersplaced:1|c
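The counter on this slide is a single text datagram of the form `name:value|c`. A minimal sketch of producing and sending one over UDP follows. The host is a placeholder, port 8125 is assumed to be where the StatsD daemon listens, and `counter_datagram` and `send_udp` are illustrative names.

```python
import socket


def counter_datagram(name, count=1):
    """Build a StatsD counter message such as b'ordersplaced:1|c'."""
    return f"{name}:{count}|c".encode("ascii")


def send_udp(payload, host="localhost", port=8125):
    """Fire-and-forget send. Host and port are assumptions for illustration."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


# The message from the slide: one order placed.
message = counter_datagram("ordersplaced")
```

Because UDP is connectionless, the client does not wait for the daemon. The trade-off is that a dropped datagram is a silently lost count.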

Page 63: Surfing the event stream

Timings

orderduration:140|ms
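A timing uses the same datagram shape with an `|ms` suffix and a duration in milliseconds. One way to produce it is to wrap the measured work in a context manager, sketched below. The `timed` helper and the `emit` callback are hypothetical; in practice `emit` would hand the bytes to a UDP sender.

```python
import time
from contextlib import contextmanager


def timing_datagram(name, millis):
    """Build a StatsD timing message such as b'orderduration:140|ms'."""
    return f"{name}:{int(round(millis))}|ms".encode("ascii")


@contextmanager
def timed(name, emit):
    """Measure the wrapped block and pass the timing message to emit()."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000.0
        emit(timing_datagram(name, elapsed_ms))


# Collect messages in a list instead of sending them, to keep the sketch offline.
sent = []
with timed("orderduration", sent.append):
    time.sleep(0.01)  # stand-in for the real order-handling work
```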

Page 64: Surfing the event stream

StatsD

Client Client

Graphite

Page 65: Surfing the event stream

StatsD

Client Client

Graphite

Page 66: Surfing the event stream

StatsD

Client Client

Graphite

Page 67: Surfing the event stream

Riemann

Page 68: Surfing the event stream

Riemann

Page 69: Surfing the event stream

Riemann

Page 70: Surfing the event stream

Riemann

Page 71: Surfing the event stream

Riemann

Client Client

Graphite

Page 72: Surfing the event stream

Page 73: Surfing the event stream

(where (service "api req") (percentiles 5 [0.5 0.95 0.99] index))

Page 74: Surfing the event stream

(where (service "api req") (percentiles 5 [0.5 0.95 0.99] index))
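Read loosely, the expression on this slide takes the "api req" events, groups them into 5-second windows, and reports the 0.5, 0.95 and 0.99 points of each window. The sketch below shows that idea in plain Python. It uses the nearest-rank method, which is an assumption and may differ from how Riemann selects its samples, and the event tuples are invented for illustration.

```python
import math
from collections import defaultdict


def percentiles(values, points=(0.5, 0.95, 0.99)):
    """Nearest-rank percentiles of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    result = {}
    for point in points:
        rank = max(1, math.ceil(point * n))  # 1-based rank into the sorted list
        result[point] = ordered[rank - 1]
    return result


def percentiles_per_window(events, interval=5, points=(0.5, 0.95, 0.99)):
    """Group (timestamp, metric) pairs into fixed windows and summarise each."""
    windows = defaultdict(list)
    for timestamp, metric in events:
        windows[timestamp // interval].append(metric)
    return {
        window * interval: percentiles(metrics, points)
        for window, metrics in sorted(windows.items())
    }


# Hypothetical request latencies in milliseconds, as (second, latency) pairs.
events = [(0, 120), (1, 80), (4, 200), (5, 90), (9, 95)]
summary = percentiles_per_window(events)
```

`summary` is keyed by the start second of each window, so the first three events land under key 0 and the last two under key 5.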

Page 75: Surfing the event stream

(def tell-ops (rollup 5 3600 (email "[email protected]")))

(streams (where (state "critical") tell-ops))
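The two expressions on this slide route events in the critical state to ops by email, but rate-limited: a handful of alerts go out straight away and the rest are held back and summarised. The sketch below models that throttling in Python with an injected clock. It is not Riemann's implementation. In particular this version only flushes the held events when the next event arrives after the interval has elapsed, and the `Rollup` class and `on_event` function are hypothetical names.

```python
class Rollup:
    """Pass the first `limit` events per interval, hold the rest for a summary."""

    def __init__(self, limit, interval, notify):
        self.limit = limit
        self.interval = interval
        self.notify = notify  # called with a list of events
        self.window_start = None
        self.sent = 0
        self.held = []

    def _roll_window(self, now):
        if self.window_start is None:
            self.window_start = now
        elif now - self.window_start >= self.interval:
            if self.held:
                self.notify(list(self.held))  # one summary for everything held
            self.held.clear()
            self.sent = 0
            self.window_start = now

    def event(self, event, now):
        self._roll_window(now)
        if self.sent < self.limit:
            self.sent += 1
            self.notify([event])
        else:
            self.held.append(event)


mails = []  # stands in for the email sender
tell_ops = Rollup(limit=5, interval=3600, notify=mails.append)


def on_event(event, now):
    """Only critical events reach the throttled notifier."""
    if event["state"] == "critical":
        tell_ops.event(event, now)


for second in range(7):
    on_event({"state": "critical", "service": "api"}, now=second)
on_event({"state": "ok", "service": "api"}, now=10)
on_event({"state": "critical", "service": "api"}, now=3600)
```

In this run the first five critical events each produce one mail, the next two are held, and the event at second 3600 triggers a two-event summary followed by its own mail.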

Page 76: Surfing the event stream

(let [client (tcp-client :host "aggregator")] (by [:host :service] (changed :state (forward client))))
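The expression on this slide splits the stream per host and service pair, and forwards an event to the aggregator only when that pair's state differs from the last one seen. A minimal Python sketch of the same filtering follows. The first event for a pair is treated as a change here, which is an assumption, and `changed_state_forwarder` is a hypothetical name.

```python
_UNSEEN = object()  # marker for "no previous state recorded"


def changed_state_forwarder(forward):
    """Return a handler that forwards events only on a per-key state change."""
    last_state = {}

    def handle(event):
        key = (event["host"], event["service"])
        if last_state.get(key, _UNSEEN) != event["state"]:
            last_state[key] = event["state"]
            forward(event)

    return handle


forwarded = []  # stands in for the TCP client to the aggregator
handle = changed_state_forwarder(forwarded.append)

for host, service, state in [
    ("www01", "api", "ok"),
    ("www01", "api", "ok"),        # same state, not forwarded
    ("www01", "api", "critical"),  # state changed
    ("www02", "api", "ok"),        # new host, tracked separately
    ("www01", "api", "critical"),  # same state, not forwarded
]:
    handle({"host": host, "service": service, "state": state})
```

Forwarding only transitions keeps the upstream aggregator's load proportional to how often things change, not to how often they report.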

Page 77: Surfing the event stream

Riemann Server

Client Client

Page 78: Surfing the event stream

Riemann Server

Client Client

Riemann Server

Client Client

Page 79: Surfing the event stream

Riemann Server

Client Client

Riemann Server

Client Client

Riemann Server

Page 80: Surfing the event stream

So What Do We Have?

Page 81: Surfing the event stream

Server Server

Graphite

Graylog 2

Server

Page 82: Surfing the event stream

Page 83: Surfing the event stream

Page 84: Surfing the event stream

Page 85: Surfing the event stream

Server Server

Graphite

Graylog 2

Dashboard A

Dashboard B

Dashboard C

Server

Page 86: Surfing the event stream

Server Server

StatsD/Riemann

Graylog 2

Graphite

Dashboard A

Dashboard B

Dashboard C

Page 87: Surfing the event stream

http://shopify.github.io/dashing/

Page 88: Surfing the event stream

Page 89: Surfing the event stream

Page 90: Surfing the event stream

Page 91: Surfing the event stream

Page 92: Surfing the event stream

Realtime Aggregator

Page 93: Surfing the event stream

Realtime Aggregator

Page 94: Surfing the event stream

Realtime Aggregator

Page 95: Surfing the event stream

Realtime Aggregator

Data is lost!

Page 96: Surfing the event stream

Realtime Aggregator

Data is lost!

Page 97: Surfing the event stream

Real-time metrics require upfront knowledge

Page 98: Surfing the event stream

Realtime Aggregator

Page 99: Surfing the event stream

Realtime Aggregator

Page 100: Surfing the event stream

Realtime Aggregator

Lossless Event Store

Page 101: Surfing the event stream

Realtime Aggregator

Lossless Event Store

Page 102: Surfing the event stream

Realtime Aggregator

Lossless Event Store

Hadoop

HBase

Cassandra

Page 103: Surfing the event stream

Riemann Server

Client Client

Page 104: Surfing the event stream

Riemann Server

Client Client

Lossless Event Store

Page 105: Surfing the event stream

Event Sourcing
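The earlier slides note that a realtime aggregator loses data and needs its metrics defined upfront, and that a lossless event store sits alongside it. A small sketch of why keeping the raw events helps follows: any metric, including one nobody planned for, can be derived later by replaying the log. The `EventLog` class, the event fields and the sample values are all invented for illustration.

```python
import json


class EventLog:
    """Append-only log that keeps every event as one JSON line."""

    def __init__(self):
        self._lines = []

    def append(self, event):
        self._lines.append(json.dumps(event, sort_keys=True))

    def replay(self):
        """Yield every stored event, oldest first."""
        return (json.loads(line) for line in self._lines)


log = EventLog()
log.append({"type": "orderplaced", "amount": 30, "country": "PL", "ts": 1286269200})
log.append({"type": "orderplaced", "amount": 45, "country": "UK", "ts": 1286269200})
log.append({"type": "fraudcase", "country": "UK", "ts": 1286269260})


def orders_placed(events):
    """The metric that was planned from the start."""
    return sum(1 for e in events if e["type"] == "orderplaced")


def revenue_by_country(events):
    """A metric added later; it still covers every historical event."""
    totals = {}
    for e in events:
        if e["type"] == "orderplaced":
            totals[e["country"]] = totals.get(e["country"], 0) + e["amount"]
    return totals


order_count = orders_placed(log.replay())
revenue = revenue_by_country(log.replay())
```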

Page 106: Surfing the event stream

But...

Page 107: Surfing the event stream

Realtime Aggregator

Page 108: Surfing the event stream

Lossless Event Store

Realtime Aggregator

Page 109: Surfing the event stream

Can I have one view?

Lossless Event Store

Realtime Aggregator

Page 110: Surfing the event stream

http://nathanmarz.com/

Page 111: Surfing the event stream

Lossless Event Store

Realtime Aggregator

Page 112: Surfing the event stream

Lossless Event Store

Realtime Aggregator

Page 113: Surfing the event stream

Lossless Event Store

Realtime Aggregator

Up to date, but only for a small window

Page 114: Surfing the event stream

Lossless Event Store

Realtime Aggregator

Consistent, but out of date

Up to date, but only for a small window

Page 115: Surfing the event stream

Lossless Event Store

Realtime Aggregator

Unified Query

Consistent, but out of date

Up to date, but only for a small window

Page 116: Surfing the event stream

Lossless Event Store

Realtime Aggregator

Lambda Architecture

Unified Query

Consistent, but out of date

Up to date, but only for a small window
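The unified query on this slide has to combine two partial answers: a batch view that is consistent but stops at some point in the past, and a realtime view that is current but only covers a short window. The sketch below shows one way such a merge could work for a simple count. The horizon timestamp, the view shapes and the function name `unified_count` are assumptions made for the example.

```python
def unified_count(batch_view, batch_horizon, realtime_window, metric):
    """Combine a precomputed total with recent events not yet in the batch view.

    batch_view: totals per metric, covering events up to and including batch_horizon.
    realtime_window: recent (metric, value, timestamp) samples.
    """
    base = batch_view.get(metric, 0)
    recent = sum(
        value
        for name, value, timestamp in realtime_window
        if name == metric and timestamp > batch_horizon  # avoid double counting
    )
    return base + recent


# Batch view computed from the lossless store, complete up to the horizon.
batch_view = {"orderplaced": 1000}
batch_horizon = 1286269200

# Realtime window; the first sample is old enough to be in the batch view already.
realtime_window = [
    ("orderplaced", 1, 1286269100),
    ("orderplaced", 1, 1286269260),
    ("orderplaced", 1, 1286269320),
]

total = unified_count(batch_view, batch_horizon, realtime_window, "orderplaced")
```

The horizon check is the important design choice: without it, events that both layers have seen would be counted twice.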

Page 117: Surfing the event stream

The Future?

Page 118: Surfing the event stream

Server Server

Aggregating Relay

Graphite

Graylog 2

Hadoop

Page 119: Surfing the event stream

Server Server

Aggregating Relay

Graphite

Graylog 2

Hadoop

Unified Query

Page 120: Surfing the event stream

Page 121: Surfing the event stream

All Your Data

Page 122: Surfing the event stream

All Your Data

In Realtime

Page 123: Surfing the event stream

All Your Data

In Realtime

Page 124: Surfing the event stream

Page 125: Surfing the event stream

Find and free your data

Page 126: Surfing the event stream

Find and free your data

Start simple

Page 127: Surfing the event stream

Find and free your data

Start simple

Create different views for different stakeholders

Page 128: Surfing the event stream

Find and free your data

Start simple

Create different views for different stakeholders

Don’t be scared of real-time!

Page 129: Surfing the event stream

[email protected]

@samnewman