@PatrickMcFadin
Patrick McFadin, Chief Evangelist for Apache Cassandra, DataStax
Storing Time Series Data with
My Background
…ran into this problem
Gave it my best shot
[Diagram: client, router, and four shards (shard 1 to shard 4)]
Patrick, all your wildest dreams will come true.
Just add complexity!
A new plan
Dynamo Paper (2007)
• How do we build a data store that is:
  • Reliable
  • Performant
  • “Always On”
• Nothing new and shiny
Evolutionary. Real. Computer Science
Also the basis for Riak and Voldemort
BigTable (2006)
• Richer data model
• 1 key. Lots of values
• Fast sequential access
• 38 Papers cited
Cassandra (2008)
• Distributed features of Dynamo
• Data model and storage from BigTable
• On February 17, 2010 it graduated to a top-level Apache project
A Data Ocean, Lake, or Pond
An In-Memory Database
A Key-Value Store
A magical database unicorn that farts rainbows
Cassandra for Applications
APACHE
CASSANDRA
Basic Architecture
[Diagram, built up over several slides: a Row is a Partition Key plus its columns (Column 1 to Column 4); a Partition groups the rows that share a partition key; a Table holds many partitions (Partition Key 1, Partition Key 2); a Keyspace (Keyspace 1) holds tables (Table 1, Table 2).]
Node
[Diagram: one server]
Token
• Each partition key is hashed to a token; with the default Murmur3Partitioner the token is a 64-bit value
• Consistent hash between -2^63 and 2^63 - 1
• Each node owns a range of those values
• A node's token marks the beginning of its range, which runs up to the next node’s token value
• Virtual nodes break these ranges down further
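The token bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the simplified 0-100 token space that the following slides use instead of the real Murmur3 range, it follows the slides' convention that a node's token is the start of its range, and the `RING` layout and the `owner` helper are hypothetical names, not part of Cassandra.

```python
from bisect import bisect_right

# Hypothetical four-node ring on a simplified 0-100 token space.
# Each entry is (token, node); the token marks the start of the node's range.
RING = [(0, "10.0.0.1"), (26, "10.0.0.2"), (51, "10.0.0.3"), (76, "10.0.0.4")]


def owner(token, ring=RING):
    """Return the node whose token range contains `token`."""
    tokens = [t for t, _ in ring]
    # Find the rightmost node token that is <= the partition's token.
    # An index of -1 (token below every node token) wraps to the last node.
    idx = bisect_right(tokens, token) - 1
    return ring[idx][1]
```

Calling `owner(15)` walks the sorted token list and picks the node whose range starts at or below 15.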
Data: token ranges as servers are added to the cluster (token space simplified to 0-100)

Cluster with one server
Token  Range
0      0-100

Cluster with two servers
Token  Range
0      0-50
51     51-100

Cluster with four servers
Token  Range
0      0-25
26     26-50
51     51-75
76     76-100
Replication
DC1: RF=1
Node      Primary
10.0.0.1  00-25
10.0.0.2  26-50
10.0.0.3  51-75
10.0.0.4  76-100
Replication
DC1: RF=2
Node      Primary  Replica
10.0.0.1  00-25    76-100
10.0.0.2  26-50    00-25
10.0.0.3  51-75    26-50
10.0.0.4  76-100   51-75
Replication
DC1: RF=3
Node      Primary  Replica  Replica
10.0.0.1  00-25    76-100   51-75
10.0.0.2  26-50    00-25    76-100
10.0.0.3  51-75    26-50    00-25
10.0.0.4  76-100   51-75    26-50
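The replication tables follow a simple pattern: the node that owns a range as primary is followed by the next nodes around the ring as replicas. A minimal Python sketch of that placement, assuming the four-node layout from the slides; `NODES`, `RANGE_STARTS`, and `replicas` are hypothetical names for illustration, and real placement depends on the configured replication strategy.

```python
# Four-node ring from the slides, token space simplified to 0-100.
NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
RANGE_STARTS = [0, 26, 51, 76]  # first token of each node's primary range


def replicas(token, rf):
    """Return the rf nodes that hold a copy of the partition at `token`."""
    if not 1 <= rf <= len(NODES):
        raise ValueError("rf must be between 1 and the number of nodes")
    # The primary is the node whose range start is the largest one <= token.
    primary = max(i for i, start in enumerate(RANGE_STARTS) if start <= token)
    # The remaining copies go to the next nodes clockwise around the ring.
    return [NODES[(primary + k) % len(NODES)] for k in range(rf)]
```

With RF=3, `replicas(15, 3)` lists the three nodes whose rows in the table above contain the range 00-25.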
Consistency
DC1: RF=3 (the four-node layout from the RF=3 replication slide)
Client: Write to partition 15
Consistency level
Consistency Level  Number of Nodes Acknowledged
One                One - Read repair triggered
Local One          One - Read repair in local DC
Quorum             51% (a majority of replicas)
Local Quorum       51% in local DC (a majority of the local replicas)
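The "51%" rows mean a majority of replicas, which for a replication factor RF works out to floor(RF / 2) + 1. A small Python sketch of the arithmetic; the function names `quorum` and `required_acks` are hypothetical, and the level names are written as plain strings for illustration only.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def required_acks(level, replication_factor):
    """Acknowledgements needed for the consistency levels in the table.

    For the LOCAL_ variants, pass the replication factor of the local
    datacenter, since only local replicas are counted.
    """
    if level in ("ONE", "LOCAL_ONE"):
        return 1
    if level in ("QUORUM", "LOCAL_QUORUM"):
        return quorum(replication_factor)
    raise ValueError("level not covered by this sketch: " + level)
```

With RF=3, a quorum write waits for two of the three replicas, so it can still succeed with one replica down.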
Consistency
DC1: RF=3 (the same four-node layout)
Client: Write to partition 15, CL = One
Client: Write to partition 15, CL = Quorum
Multi-datacenter
Client: Write to partition 15

DC1: RF=3
Node      Primary  Replica  Replica
10.0.0.1  00-25    76-100   51-75
10.0.0.2  26-50    00-25    76-100
10.0.0.3  51-75    26-50    00-25
10.0.0.4  76-100   51-75    26-50

DC2: RF=3
Node      Primary  Replica  Replica
10.1.0.1  00-25    76-100   51-75
10.1.0.2  26-50    00-25    76-100
10.1.0.3  51-75    26-50    00-25
10.1.0.4  76-100   51-75    26-50
Cassandra Query Language - CQL
Table
CREATE TABLE weather_station (
    id text,
    name text,
    country_code text,
    state_code text,
    call_sign text,
    lat double,
    long double,
    elevation double,
    PRIMARY KEY(id)
);
Callouts: Table Name (weather_station), Column Name, Column CQL Type, Primary Key Designation, Partition Key (id)
Table
CREATE TABLE daily_aggregate_precip (
    wsid text,
    year int,
    month int,
    day int,
    precipitation counter,
    PRIMARY KEY ((wsid), year, month, day)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC);
Callouts: Partition Key (wsid), Clustering Columns (year, month, day), Order Override (CLUSTERING ORDER BY)
Insert
INSERT INTO weather_station (id, call_sign, country_code, elevation, lat, long, name, state_code)
VALUES ('727930:24233', 'KSEA', 'US', 121.9, 47.467, -122.32, 'SEATTLE SEATTLE-TACOMA INTL A', 'WA');
Callouts: Table Name, Fields, Values, Partition Key: Required
Select
SELECT id, call_sign, country_code, elevation, lat, long, name, state_code
FROM weather_station
WHERE id = '727930:24233';

 id           | call_sign | country_code | elevation | lat    | long    | name                          | state_code
--------------+-----------+--------------+-----------+--------+---------+-------------------------------+------------
 727930:24233 | KSEA      | US           | 121.9     | 47.467 | -122.32 | SEATTLE SEATTLE-TACOMA INTL A | WA

Callouts: Fields, Table Name, Primary Key: Partition Key Required
Update
UPDATE weather_station
SET name = 'SeaTac International Airport'
WHERE id = '727930:24233';

 id           | call_sign | country_code | elevation | lat    | long    | name                         | state_code
--------------+-----------+--------------+-----------+--------+---------+------------------------------+------------
 727930:24233 | KSEA      | US           | 121.9     | 47.467 | -122.32 | SeaTac International Airport | WA

Callouts: Table Name, Fields to Update: Not in Primary Key, Primary Key
Delete
DELETE FROM weather_station
WHERE id = '727930:24233';
Callouts: Table Name, Primary Key: Required
Collections
Set
CREATE TABLE weather_station (
    id text,
    name text,
    country_code text,
    state_code text,
    call_sign text,
    lat double,
    long double,
    elevation double,
    equipment set<text>,
    PRIMARY KEY(id)
);
Callouts on "equipment set<text>": Column Name, CQL Type: For Ordering
Collections
Set
List
CREATE TABLE weather_station (
    id text,
    name text,
    country_code text,
    state_code text,
    call_sign text,
    lat double,
    long double,
    elevation double,
    equipment set<text>,
    service_dates list<timestamp>,
    PRIMARY KEY(id)
);
Callouts on "equipment set<text>": Column Name, CQL Type: For Ordering
Callouts on "service_dates list<timestamp>": Column Name, CQL Type
Collections
Set
List
Map
CREATE TABLE weather_station (
    id text,
    name text,
    country_code text,
    state_code text,
    call_sign text,
    lat double,
    long double,
    elevation double,
    equipment set<text>,
    service_dates list<timestamp>,
    service_notes map<timestamp,text>,
    PRIMARY KEY(id)
);
Callouts on "equipment set<text>": Column Name, CQL Type: For Ordering
Callouts on "service_dates list<timestamp>": Column Name, CQL Type
Callouts on "service_notes map<timestamp,text>": Column Name, CQL Key Type, CQL Value Type
UDF and UDA
As of Cassandra 2.2
User Defined Function
CREATE FUNCTION state_group_and_count(state map<text, int>, type text)
CALLED ON NULL INPUT
RETURNS map<text, int>
LANGUAGE java AS '
    Integer count = (Integer) state.get(type);
    if (count == null) count = 1;
    else count++;
    state.put(type, count);
    return state;
';
User Defined Aggregate
CREATE OR REPLACE AGGREGATE group_and_count(text)
SFUNC state_group_and_count
STYPE map<text, int>
INITCOND {};
Example: Weather Station
• Weather station collects data
• Cassandra stores in sequence
• Application reads in sequence
Queries supported
Get weather data given
• Weather Station ID
• Weather Station ID and Time
• Weather Station ID and Range of Time
CREATE TABLE raw_weather_data (
    wsid text,
    year int,
    month int,
    day int,
    hour int,
    temperature double,
    dewpoint double,
    pressure double,
    wind_direction int,
    wind_speed double,
    sky_condition int,
    sky_condition_text text,
    one_hour_precip double,
    six_hour_precip double,
    PRIMARY KEY ((wsid), year, month, day, hour)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);
Primary Key
(The raw_weather_data definition from the "Queries supported" slide, with this clause highlighted:)
PRIMARY KEY ((wsid), year, month, day, hour)
Primary key relationship
PRIMARY KEY ((wsid),year,month,day,hour)
Partition Key: wsid
Clustering Columns: year, month, day, hour

Partition Key  Clustering Columns  temperature
10010:99999    2005:12:1:10        -5.6
               2005:12:1:9         -5.1
               2005:12:1:8         -4.9
               2005:12:1:7         -5.3
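One way to picture the primary key relationship is as a map from partition key to rows that are kept ordered by their clustering columns. The Python sketch below models that idea only; the `Table` class and its methods are hypothetical and say nothing about how Cassandra implements storage.

```python
from collections import defaultdict


class Table:
    """Toy model: partition key -> {clustering key -> temperature}."""

    def __init__(self):
        self.partitions = defaultdict(dict)

    def insert(self, wsid, year, month, day, hour, temperature):
        # wsid selects the partition; (year, month, day, hour) identifies
        # the row inside it, so re-inserting the same key overwrites.
        self.partitions[wsid][(year, month, day, hour)] = temperature

    def rows(self, wsid):
        # Newest first, mirroring CLUSTERING ORDER BY (... DESC).
        return sorted(self.partitions.get(wsid, {}).items(), reverse=True)


table = Table()
table.insert("10010:99999", 2005, 12, 1, 7, -5.3)
table.insert("10010:99999", 2005, 12, 1, 10, -5.6)
table.insert("10010:99999", 2005, 12, 1, 8, -4.9)
table.insert("10010:99999", 2005, 12, 1, 9, -5.1)
```

All four readings land in the single partition `10010:99999`, and `table.rows("10010:99999")` returns them ordered by hour, newest first, regardless of insertion order.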
Partition keys
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,7,-5.6);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('722266:13850',2005,12,1,7,-5.6);

10010:99999   -> Murmur3 Hash -> Token = 7224631062609997448
722266:13850  -> Murmur3 Hash -> Token = -6804302034103043898
Consistent hash. With Murmur3Partitioner the token is a 64-bit number between -2^63 and 2^63 - 1

For this example, let’s make it a reasonable number
10010:99999   -> Murmur3 Hash -> Token = 15
722266:13850  -> Murmur3 Hash -> Token = 77
Data Locality
DC1: RF=3 and DC2: RF=3 (the same replica layout as in the multi-datacenter slides)
Two clients each issue: Read partition 15
Data Locality
wsid=‘10010:99999’ ?
1000 Node Cluster
You are here!
Writes
The full raw_weather_data table, reduced to a single measurement for the write examples:
CREATE TABLE raw_weather_data (
    wsid text,
    year int,
    month int,
    day int,
    hour int,
    temperature double,
    PRIMARY KEY ((wsid), year, month, day, hour)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,10,-5.6);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,9,-5.1);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,8,-4.9);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature) VALUES ('10010:99999',2005,12,1,7,-5.3);
Write Path
Client:
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
VALUES ('10010:99999',2005,12,1,7,-5.3);
[Diagram: inside the Node, the write is recorded in the Commit Log and in the Memtable; Memtable data is flushed to SSTables on disk, and Compaction merges SSTables.]
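The write path can be sketched as a toy model in Python. Everything here is a simplification under stated assumptions: the `Node` class, its `flush_threshold` parameter, and the in-memory lists standing in for the commit log and SSTables are hypothetical, and real flushing and compaction are driven by size and strategy settings rather than a row count.

```python
class Node:
    """Toy write path: commit log + memtable, flushed to sorted SSTables."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # append-only record, used for crash recovery
        self.memtable = {}     # in-memory writes, keyed by primary key
        self.sstables = []     # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value            # then the in-memory table
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the memtable as one sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log = []  # flushed data no longer needs the log

    def compact(self):
        # Merge all SSTables into one; later (newer) tables win on conflict.
        merged = {}
        for sstable in self.sstables:
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))]
```

Writing the four sample readings with `flush_threshold=2` produces two SSTables, and `compact()` merges them into a single sorted run.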
Date Tiered Compaction Strategy
• Group similar time blocks
• Never compact again
• Used for high density
SSTable: T=2015-01-01 -> 2015-01-05
SSTable: T=2015-01-06 -> 2015-01-10
SSTable: T=2015-01-11 -> 2015-01-15
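The "group similar time blocks" idea can be illustrated by bucketing dates into the five-day windows shown on the slide. This Python sketch assumes fixed-size windows starting at 2015-01-01; `EPOCH`, `WINDOW_DAYS`, `window_for`, and `group_by_window` are hypothetical names, and the actual strategy's windowing rules are more involved than this.

```python
from collections import defaultdict
from datetime import date, timedelta

EPOCH = date(2015, 1, 1)  # assumed start of the first window
WINDOW_DAYS = 5           # assumed fixed window size, as on the slide


def window_for(day):
    """Return the (start, end) dates of the window that contains `day`."""
    index = (day - EPOCH).days // WINDOW_DAYS
    start = EPOCH + timedelta(days=index * WINDOW_DAYS)
    end = start + timedelta(days=WINDOW_DAYS - 1)
    return start, end


def group_by_window(days):
    """Group dates by window, as if each window became its own SSTable."""
    groups = defaultdict(list)
    for day in days:
        groups[window_for(day)].append(day)
    return dict(groups)
```

Data written on 2015-01-07 falls in the 2015-01-06 to 2015-01-10 window, so it would be grouped with other data from that window and left alone once the window has passed.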
Storage Model - Logical View
SELECT wsid, hour, temperature
FROM raw_weather_data
WHERE wsid='10010:99999' AND year = 2005 AND month = 12 AND day = 1;

wsid         hour          temperature
10010:99999  2005:12:1:10  -5.6
10010:99999  2005:12:1:9   -5.1
10010:99999  2005:12:1:8   -4.9
10010:99999  2005:12:1:7   -5.3
Storage Model - Disk Layout
Merged, Sorted and Stored Sequentially
SELECT wsid, hour, temperature
FROM raw_weather_data
WHERE wsid='10010:99999' AND year = 2005 AND month = 12 AND day = 1;

Partition 10010:99999, stored as one sequential run:
2005:12:1:10  -5.6
2005:12:1:9   -5.1
2005:12:1:8   -4.9
2005:12:1:7   -5.3

Later readings join the same partition as they arrive:
2005:12:1:11  -4.9
2005:12:1:12  -5.4
Read Path
Client:
SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1
AND hour >= 7 AND hour <= 10;
[Diagram: inside the Node, the read combines Data from the Memtable and the SSTables.]
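The read path can be sketched as a merge of the memtable with the SSTables. The Python below is a simplification under stated assumptions: `read_partition` is a hypothetical helper, SSTables are passed oldest first, and "newer source wins" stands in for the timestamp-based reconciliation a real node performs.

```python
def read_partition(memtable, sstables):
    """Merge rows from SSTables (oldest first) and the memtable.

    Each source maps a clustering key, here the hour, to a temperature.
    Later sources overwrite earlier ones, so the memtable wins.
    """
    merged = {}
    for sstable in sstables:
        merged.update(sstable)
    merged.update(memtable)
    # Newest hour first, matching the table's DESC clustering order.
    return sorted(merged.items(), reverse=True)


# Sample data from the slides, split across two SSTables and the memtable.
sstables = [{7: -5.3, 8: -4.9}, {9: -5.1}]
memtable = {10: -5.6}
rows = read_partition(memtable, sstables)
```

The caller gets one ordered result even though the rows came from three separate places.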
Query patterns
• Range queries
• “Slice” operation on disk
SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1
AND hour >= 7 AND hour <= 10;
Partition key for locality: 10010:99999
Single seek on disk:
2005:12:1:10  -5.6
2005:12:1:9   -5.1
2005:12:1:8   -4.9
2005:12:1:7   -5.3
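Because the rows of a partition are stored sorted, a range query amounts to finding the two ends of the range and reading everything in between. A minimal Python sketch using binary search; `PARTITION` and `slice_hours` are hypothetical names, and the list is kept in ascending hour order here purely so that `bisect` can be used directly.

```python
from bisect import bisect_left, bisect_right

# One partition's rows for 2005-12-01 as (hour, temperature), ascending.
PARTITION = [(7, -5.3), (8, -4.9), (9, -5.1), (10, -5.6), (11, -4.9), (12, -5.4)]


def slice_hours(rows, low, high):
    """Return rows with low <= hour <= high, newest first."""
    hours = [hour for hour, _ in rows]
    start = bisect_left(hours, low)    # first row with hour >= low
    end = bisect_right(hours, high)    # one past the last row with hour <= high
    # One contiguous slice; reversed to match the DESC clustering order.
    return rows[start:end][::-1]
```

`slice_hours(PARTITION, 7, 10)` returns the four rows selected by `hour >= 7 AND hour <= 10` as one contiguous slice.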
Query patterns
• Range queries
• “Slice” operation on disk
Programmers like this
Sorted by event_time
wsid         hour          temperature
10010:99999  2005:12:1:10  -5.6
10010:99999  2005:12:1:9   -5.1
10010:99999  2005:12:1:8   -4.9
10010:99999  2005:12:1:7   -5.3
SELECT wsid,hour,temperature
FROM raw_weather_data
WHERE wsid='10010:99999'
AND year = 2005 AND month = 12 AND day = 1
AND hour >= 7 AND hour <= 10;
Thank you!
Bring the questions
Follow me on twitter @PatrickMcFadin