Storing time series data with Apache Cassandra

Post on 03-Aug-2015


@PatrickMcFadin

Patrick McFadin, Chief Evangelist for Apache Cassandra, DataStax


My Background

…ran into this problem

Gave it my best shot

[Diagram: client -> router -> shard 1, shard 2, shard 3, shard 4]

Patrick, all your wildest dreams will come true.

Just add complexity!

A new plan

Dynamo Paper (2007)

• How do we build a data store that is:
  • Reliable
  • Performant
  • “Always On”
• Nothing new and shiny

Evolutionary. Real. Computer Science

Also the basis for Riak and Voldemort

BigTable (2006)

• Richer data model
• 1 key. Lots of values
• Fast sequential access
• 38 Papers cited

Cassandra (2008)

• Distributed features of Dynamo
• Data Model and storage from BigTable
• February 17, 2010 it graduated to a top-level Apache project

A Data Ocean, Lake, or Pond

An In-Memory Database

A Key-Value Store

A magical database unicorn that farts rainbows

Cassandra for Applications


Basic Architecture

[Diagram: Row] Partition Key 1 | Column 1 | Column 2 | Column 3 | Column 4

[Diagram: Partition] Several rows that share Partition Key 1, each with Column 1 to Column 4

[Diagram: Table] Partitions for Partition Key 1 and Partition Key 2

[Diagram: Keyspace] Keyspace 1 contains Table 1 and Table 2, each holding partitions for Partition Key 1 and Partition Key 2

Node

[Diagram: Server]

Token

• Each partition key is hashed to a token (Murmur3 produces a 128-bit hash; Cassandra keeps 64 bits of it as the token)
• Consistent hash between -2^63 and 2^63 - 1
• Each node owns a range of those values
• A node's range runs from its own token up to the next node's token value
• Virtual Nodes break these ranges down further

Data

Token   Range
0       …

Cluster: one server

Token   Range
0       0-100

Cluster: two servers

Token   Range
0       0-50
51      51-100

Cluster: four servers

Token   Range
0       0-25
26      26-50
51      51-75
76      76-100
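The token tables above amount to a lookup: given a token, find the server whose range contains it. A minimal Python sketch of that lookup, assuming the toy 0-100 token space used on these slides (a real cluster uses the full Murmur3 token range) and hypothetical server names:

```python
from bisect import bisect_right

# Toy ring from the four-server slide: each entry is (token, server), where the
# token marks the start of that server's range. Server names are hypothetical.
RING = [(0, "server-1"), (26, "server-2"), (51, "server-3"), (76, "server-4")]

def owner(token, ring=RING):
    """Return the server whose range contains the token (toy 0-100 token space)."""
    if not 0 <= token <= 100:
        raise ValueError("token outside the toy 0-100 space")
    starts = [start for start, _ in ring]
    # bisect_right finds the first start greater than the token;
    # the owner is the entry just before it.
    return ring[bisect_right(starts, token) - 1][1]

print(owner(15))  # server-1
print(owner(77))  # server-4
```

Keeping the tokens in a sorted list means that adding a server only inserts one entry; the lookup itself does not change.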

Replication

DC1: RF=1

Node       Primary
10.0.0.1   00-25
10.0.0.2   26-50
10.0.0.3   51-75
10.0.0.4   76-100

Replication

DC1: RF=2

Node       Primary   Replica
10.0.0.1   00-25     76-100
10.0.0.2   26-50     00-25
10.0.0.3   51-75     26-50
10.0.0.4   76-100    51-75

Replication

DC1: RF=3

Node       Primary   Replica   Replica
10.0.0.1   00-25     76-100    51-75
10.0.0.2   26-50     00-25     76-100
10.0.0.3   51-75     26-50     00-25
10.0.0.4   76-100    51-75     26-50
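The replica columns in these tables follow a simple pattern: each range is stored on the node that owns it and then on the next nodes around the ring. A small Python sketch of that pattern, using the node addresses and ranges from the slides; it is only an illustration of the tables, since placement in a real cluster also depends on the replication strategy and topology:

```python
# Nodes in ring order with the primary range each one owns (from the slides).
NODES = [
    ("10.0.0.1", "00-25"),
    ("10.0.0.2", "26-50"),
    ("10.0.0.3", "51-75"),
    ("10.0.0.4", "76-100"),
]

def replicas_for(primary_range, rf, nodes=NODES):
    """Return the rf nodes holding a range: its owner plus the next nodes on the ring."""
    if not 1 <= rf <= len(nodes):
        raise ValueError("rf must be between 1 and the number of nodes")
    ranges = [r for _, r in nodes]
    start = ranges.index(primary_range)
    return [nodes[(start + i) % len(nodes)][0] for i in range(rf)]

def ranges_on(node, rf, nodes=NODES):
    """Return every range stored on a node (primary and replicas), in ring order."""
    return [r for _, r in nodes if node in replicas_for(r, rf, nodes)]

print(replicas_for("00-25", 3))   # ['10.0.0.1', '10.0.0.2', '10.0.0.3']
print(ranges_on("10.0.0.1", 3))   # ['00-25', '51-75', '76-100']
```

With `rf=3`, `ranges_on("10.0.0.1", 3)` returns the same three ranges as the first row of the RF=3 table.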

Consistency

DC1: RF=3 (same four-node ring and replica table as in the Replication slides)

Client: Write to partition 15

Consistency level

Consistency Level   Number of Nodes Acknowledged
One                 One - Read repair triggered
Local One           One - Read repair in local DC
Quorum              51% (a majority of all replicas)
Local Quorum        51% in local DC (a majority of the local replicas)
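The "51%" in the table is a majority of replicas, which works out to floor(RF / 2) + 1. A short Python sketch of the acknowledgement counts for the four levels on this slide; the function names and the level strings are illustrative choices, not a driver API:

```python
def quorum(replication_factor):
    """Number of replicas that make a majority: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1

def acks_required(level, rf_per_dc, local_dc):
    """Replica acknowledgements needed for the four levels shown on the slide."""
    if level in ("ONE", "LOCAL_ONE"):
        return 1
    if level == "QUORUM":
        # A majority across the replicas of every datacenter combined.
        return quorum(sum(rf_per_dc.values()))
    if level == "LOCAL_QUORUM":
        # A majority of the replicas in the coordinator's datacenter only.
        return quorum(rf_per_dc[local_dc])
    raise ValueError(f"unsupported level: {level}")

print(acks_required("QUORUM", {"DC1": 3}, "DC1"))                    # 2
print(acks_required("QUORUM", {"DC1": 3, "DC2": 3}, "DC1"))          # 4
print(acks_required("LOCAL_QUORUM", {"DC1": 3, "DC2": 3}, "DC1"))    # 2
```

With two datacenters at RF=3 each, Quorum has to wait for replicas in both datacenters, while Local Quorum can be satisfied inside one.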

Consistency

DC1: RF=3

Client: Write to partition 15, CL = One

Consistency

DC1: RF=3

Client: Write to partition 15, CL = Quorum

Multi-datacenter

DC1: RF=3, DC2: RF=3

Client: Write to partition 15

[Diagram: the DC1 ring (10.0.0.1 to 10.0.0.4) and the DC2 ring (10.1.0.1 to 10.1.0.4); each datacenter covers token ranges 00-25, 26-50, 51-75 and 76-100 with the same primary and replica layout as the RF=3 table]

Cassandra Query Language - CQL

Table

CREATE TABLE weather_station (
    id text,
    name text,
    country_code text,
    state_code text,
    call_sign text,
    lat double,
    long double,
    elevation double,
    PRIMARY KEY(id)
);

Callouts: Table Name; Column Name; Column CQL Type; Primary Key Designation; Partition Key

Table

CREATE TABLE daily_aggregate_precip (
    wsid text,
    year int,
    month int,
    day int,
    precipitation counter,
    PRIMARY KEY ((wsid), year, month, day)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC);

Callouts: Partition Key; Clustering Columns; Order Override

Insert

INSERT INTO weather_station (id, call_sign, country_code, elevation, lat, long, name, state_code)
VALUES ('727930:24233', 'KSEA', 'US', 121.9, 47.467, -122.32, 'SEATTLE SEATTLE-TACOMA INTL A', 'WA');

Callouts: Table Name; Fields; Values; Partition Key: Required

Select

SELECT id, call_sign, country_code, elevation, lat, long, name, state_code
FROM weather_station
WHERE id = '727930:24233';

 id           | call_sign | country_code | elevation | lat    | long    | name                          | state_code
--------------+-----------+--------------+-----------+--------+---------+-------------------------------+------------
 727930:24233 |      KSEA |           US |     121.9 | 47.467 | -122.32 | SEATTLE SEATTLE-TACOMA INTL A |         WA

Callouts: Fields; Table Name; Primary Key: Partition Key Required

Update

UPDATE weather_station
SET name = 'SeaTac International Airport'
WHERE id = '727930:24233';

 id           | call_sign | country_code | elevation | lat    | long    | name                         | state_code
--------------+-----------+--------------+-----------+--------+---------+------------------------------+------------
 727930:24233 |      KSEA |           US |     121.9 | 47.467 | -122.32 | SeaTac International Airport |         WA

Callouts: Table Name; Fields to Update: Not in Primary Key; Primary Key

Delete

DELETE FROM weather_station
WHERE id = '727930:24233';

Callouts: Table Name; Primary Key: Required

Collections: Set, List, Map

CREATE TABLE weather_station (
    id text,
    name text,
    country_code text,
    state_code text,
    call_sign text,
    lat double,
    long double,
    elevation double,
    equipment set<text>,
    service_dates list<timestamp>,
    service_notes map<timestamp,text>,
    PRIMARY KEY(id)
);

Set

equipment set<text>
Callouts: Column Name; CQL Type: For Ordering

List

service_dates list<timestamp>
Callouts: Column Name; CQL Type

Map

service_notes map<timestamp,text>
Callouts: Column Name; CQL Key Type; CQL Value Type

UDF and UDA

User Defined Function

CREATE FUNCTION state_group_and_count( state map<text, int>, type text )
CALLED ON NULL INPUT
RETURNS map<text, int>
LANGUAGE java AS '
    Integer count = (Integer) state.get(type);
    if (count == null) count = 1;
    else count++;
    state.put(type, count);
    return state;
';

User Defined Aggregate

CREATE OR REPLACE AGGREGATE group_and_count(text)
SFUNC state_group_and_count
STYPE map<text, int>
INITCOND {};

As of Cassandra 2.2

Example: Weather Station

• Weather station collects data
• Cassandra stores in sequence
• Application reads in sequence

Queries supported

CREATE TABLE raw_weather_data (
    wsid text,
    year int,
    month int,
    day int,
    hour int,
    temperature double,
    dewpoint double,
    pressure double,
    wind_direction int,
    wind_speed double,
    sky_condition int,
    sky_condition_text text,
    one_hour_precip double,
    six_hour_precip double,
    PRIMARY KEY ((wsid), year, month, day, hour)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);

Get weather data given
• Weather Station ID
• Weather Station ID and Time
• Weather Station ID and Range of Time

Primary key relationship

PRIMARY KEY ((wsid),year,month,day,hour)

Partition Key: wsid
Clustering Columns: year, month, day, hour

Partition Key   Clustering Columns   temperature
10010:99999     2005:12:1:10         -5.6
10010:99999     2005:12:1:9          -5.1
10010:99999     2005:12:1:8          -4.9
10010:99999     2005:12:1:7          -5.3

Partition keys

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,7,-5.6);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('722266:13850',2005,12,1,7,-5.6);

10010:99999    Murmur3 Hash   Token = 7224631062609997448
722266:13850   Murmur3 Hash   Token = -6804302034103043898

Consistent hash. Murmur3 produces a 128-bit hash; the token is a 64-bit number between -2^63 and 2^63 - 1.

Partition keys

For this example, let’s make it a reasonable number:

10010:99999    Murmur3 Hash   Token = 15
722266:13850   Murmur3 Hash   Token = 77

Data Locality

DC1: RF=3, DC2: RF=3

[Diagram: the DC1 ring (10.0.0.1 to 10.0.0.4) and the DC2 ring (10.1.0.1 to 10.1.0.4), with a client next to each datacenter issuing "Read partition 15"]

Data Locality

wsid='10010:99999' ?

1000 Node Cluster

You are here!

Writes

CREATE TABLE raw_weather_data (
    wsid text,
    year int,
    month int,
    day int,
    hour int,
    temperature double,
    PRIMARY KEY ((wsid), year, month, day, hour)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,10,-5.6);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,9,-5.1);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,8,-4.9);

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,7,-5.3);

Write Path

Client:

INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,7,-5.3);

[Diagram: on the Node, the write is recorded in the Commit Log and in the Memtable; Memtables are flushed to SSTables under Data, and Compaction merges SSTables]
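The write path in the diagram can be mimicked with a few lines of Python. This is a toy model under loud assumptions: the class `ToyNode`, its `memtable_limit`, and the "later SSTable wins" merge rule are all invented for illustration, and it ignores timestamps, tombstones, and the DESC clustering order that real Cassandra honours.

```python
class ToyNode:
    """Toy model of the write path: commit log, memtable, immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in memory, keyed by (partition key, clustering key)
        self.sstables = []     # each SSTable is a sorted, immutable tuple of rows
        self.memtable_limit = memtable_limit

    def write(self, wsid, clustering, temperature):
        key = (wsid, clustering)
        self.commit_log.append((key, temperature))  # durability first
        self.memtable[key] = temperature            # then the in-memory table
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted SSTable and start a fresh memtable."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def compact(self):
        """Merge all SSTables into one; in this toy, later SSTables win on repeated keys."""
        merged = {}
        for table in self.sstables:
            merged.update(dict(table))
        self.sstables = [tuple(sorted(merged.items()))] if merged else []


node = ToyNode(memtable_limit=2)
for hour, temp in [(10, -5.6), (9, -5.1), (8, -4.9), (7, -5.3)]:
    node.write("10010:99999", (2005, 12, 1, hour), temp)

print(len(node.sstables))  # 2 SSTables after four writes with a limit of 2
node.compact()
print(len(node.sstables))  # 1 SSTable after compaction
```

The point of the sketch is the ordering: every write is appended and buffered, and sorting only happens when a memtable is flushed or SSTables are merged.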

Date Tiered Compaction Strategy

• Group similar time blocks
• Never compact again
• Used for high density

SSTable   T=2015-01-01 -> 2015-01-05
SSTable   T=2015-01-06 -> 2015-01-10
SSTable   T=2015-01-11 -> 2015-01-15
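The idea of "group similar time blocks" can be sketched as bucketing SSTables by the time window their data falls in. The Python below uses fixed five-day windows to match the slide; it is a simplification and not the actual Date Tiered Compaction algorithm, and the SSTable names, the `epoch`, and `window_days` are hypothetical.

```python
from collections import defaultdict
from datetime import date

def group_by_window(sstables, window_days, epoch=date(2015, 1, 1)):
    """Bucket (name, newest_date) pairs by the fixed-size window holding newest_date."""
    windows = defaultdict(list)
    for name, newest in sstables:
        index = (newest - epoch).days // window_days
        windows[index].append(name)
    return dict(windows)

tables = [
    ("sstable-a", date(2015, 1, 3)),
    ("sstable-b", date(2015, 1, 5)),
    ("sstable-c", date(2015, 1, 7)),
    ("sstable-d", date(2015, 1, 12)),
]
print(group_by_window(tables, window_days=5))
# {0: ['sstable-a', 'sstable-b'], 1: ['sstable-c'], 2: ['sstable-d']}
```

SSTables that land in the same window are candidates to be merged together; once a window is in the past, its merged SSTable is left alone.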

Storage Model - Logical View

SELECT wsid, hour, temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1;

wsid          hour           temperature
10010:99999   2005:12:1:10   -5.6
10010:99999   2005:12:1:9    -5.1
10010:99999   2005:12:1:8    -4.9
10010:99999   2005:12:1:7    -5.3

Storage Model - Disk Layout

SELECT wsid, hour, temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1;

Merged, Sorted and Stored Sequentially

Partition 10010:99999:

2005:12:1:10   2005:12:1:9   2005:12:1:8   2005:12:1:7
-5.6           -5.1          -4.9          -5.3

New readings join the same partition: first 2005:12:1:11 (-4.9), then 2005:12:1:12 (-5.4).

Read Path

Client:

SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1
AND hour >= 7 AND hour <= 10;

[Diagram: on the Node, the read combines the Memtable with the SSTables under Data]
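A read has to combine what is still in the memtable with what has already been flushed to SSTables. The Python sketch below shows that merge for one partition. It is a simplification: `read_partition` is a hypothetical helper, and it lets later sources overwrite earlier ones, whereas Cassandra resolves conflicts by write timestamp.

```python
def read_partition(wsid, memtable, sstables):
    """Merge one partition's rows from SSTables (oldest first) and the memtable."""
    merged = {}
    for source in [*sstables, memtable]:  # later sources overwrite earlier ones
        for (key, clustering), value in source.items():
            if key == wsid:
                merged[clustering] = value
    # The table uses CLUSTERING ORDER BY ... DESC, so return newest first.
    return sorted(merged.items(), reverse=True)

sstables = [
    {("10010:99999", (2005, 12, 1, 7)): -5.3, ("10010:99999", (2005, 12, 1, 8)): -4.9},
    {("10010:99999", (2005, 12, 1, 9)): -5.1},
]
memtable = {("10010:99999", (2005, 12, 1, 10)): -5.6}

for clustering, temperature in read_partition("10010:99999", memtable, sstables):
    print(clustering, temperature)
```

The rows come back in the same newest-first order as the slides, regardless of which SSTable or memtable each one came from.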

Query patterns

• Range queries
• “Slice” operation on disk

SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1
AND hour >= 7 AND hour <= 10;

Partition key for locality: 10010:99999

Single seek on disk:

2005:12:1:10   2005:12:1:9   2005:12:1:8   2005:12:1:7
-5.6           -5.1          -4.9          -5.3
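Because a partition's rows are stored sorted by clustering key, a range query is one contiguous slice rather than a scan. A Python sketch of that slice using binary search; the rows are kept ascending here so that `bisect` applies directly (the table in the slides sorts DESC), and `slice_rows` is a hypothetical helper:

```python
from bisect import bisect_left, bisect_right

# One partition's rows, sorted by clustering key (year, month, day, hour).
PARTITION = [
    ((2005, 12, 1, 7), -5.3),
    ((2005, 12, 1, 8), -4.9),
    ((2005, 12, 1, 9), -5.1),
    ((2005, 12, 1, 10), -5.6),
]

def slice_rows(rows, low, high):
    """Return the contiguous run of rows with low <= clustering key <= high."""
    keys = [key for key, _ in rows]
    return rows[bisect_left(keys, low):bisect_right(keys, high)]

# Equivalent of: hour >= 8 AND hour <= 9 on 2005-12-01
for key, temperature in slice_rows(PARTITION, (2005, 12, 1, 8), (2005, 12, 1, 9)):
    print(key, temperature)
```

Two binary searches find the start and end of the run, and everything in between is read sequentially.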

Query patterns

• Range queries
• “Slice” operation on disk

SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1
AND hour >= 7 AND hour <= 10;

Programmers like this

Sorted by time (the clustering columns):

wsid          hour           temperature
10010:99999   2005:12:1:10   -5.6
10010:99999   2005:12:1:9    -5.1
10010:99999   2005:12:1:8    -4.9
10010:99999   2005:12:1:7    -5.3

Thank you!

Bring the questions

Follow me on Twitter: @PatrickMcFadin