Javantura v2 - Data modeling with Apache Cassandra - Marko Švaljek

description

Cassandra is a NoSQL, column-family-oriented database with built-in scalability. More and more companies face the challenge of scale every day, and Cassandra is often their weapon of choice. Being NoSQL does not mean that data modeling does not matter; in reality it is crucial to any successful deployment. Lately more and more developers have been asking me to say more about data modeling in Cassandra and how it compares with the relational approach. That is what this talk is all about.

Transcript of Javantura v2 - Data modeling with Apache Cassandra - Marko Švaljek

Page 1: Javantura v2 - Data modeling with Apache Cassandra - Marko Švaljek

Data modeling with Apache Cassandra

Marko Švaljek @msvaljek

Page 2

About me
•  Software Developer at Kapsch
•  Zagreb Cassandra Meetup

Page 3

About Cassandra
•  Written in Java
•  Wide Row Store
•  Masterless
•  DataStax

Page 4

About Cassandra
•  No Single Point of failure ✓
•  Cross Datacenter ✓
•  Linear Scaling ✓
•  Data modeling ?

Page 5

Data Model
•  Highly denormalized
•  No foreign keys
•  No referential integrity
•  No joins

Page 6

Compared to Relational
•  Think about the questions up front
•  No penalty for wide rows
•  Disk is cheap
•  Bring on the writes

Page 7

Cassandra Objects
•  Keyspace (database)
•  Table
•  Primary key
•  Index

Page 8

Data Organization

Page 9

Wide Row

Page 10

CQL
•  Cassandra Query Language
•  Similar to SQL
•  Learning curve
•  No code reading

Page 11

Create table

CREATE TABLE employees (
    name text PRIMARY KEY,
    age int,
    role text
);

Page 12

Inserting

INSERT INTO employees (name, age, role) VALUES ('mario', 38, 'SM');
INSERT INTO employees (name, age, role) VALUES ('tom', 24, 'DEV');

Page 13

Selecting

SELECT * FROM employees;

 name  | age | role
-------+-----+------
 tom   |  24 | DEV
 mario |  38 | SM

Page 14

Hmm …

SELECT * FROM employees WHERE age > 20;

Bad Request: No indexed columns present in by-columns clause with Equal operator

Page 15

How it’s stored
•  Every row is a partition
•  There is no “next” row
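To see why the `age > 20` query on the previous page is rejected, it can help to picture the table as a hash map keyed by partition key. The following is a toy Python model, not Cassandra's storage engine. The MD5-based `token` function, the `NODES` count and the helper names are assumptions made for illustration (Cassandra's default partitioner uses Murmur3).

```python
import hashlib

NODES = 3  # hypothetical cluster size, for illustration only

def token(partition_key: str) -> int:
    """Stand-in for the partitioner: hash the key to a position on the ring."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def node_for(partition_key: str) -> int:
    """Simplified ownership rule: token modulo the number of nodes."""
    return token(partition_key) % NODES

# employees table: name is the whole primary key, so every row is its own partition
employees = {
    "mario": {"age": 38, "role": "SM"},
    "tom": {"age": 24, "role": "DEV"},
}

def get_by_name(name: str):
    """Lookup by partition key: hash the key and go straight to one partition."""
    return employees.get(name)

def older_than(age: int):
    """No key to hash: every partition has to be inspected one by one."""
    scanned = 0
    hits = []
    for name, row in employees.items():
        scanned += 1
        if row["age"] > age:
            hits.append(name)
    return sorted(hits), scanned
```

In this model `get_by_name("tom")` touches a single partition, while `older_than(20)` has to visit all of them. Partitions are placed by hash, so there is no ordering between rows that a range predicate could exploit.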

Page 16

Primary key

CREATE TABLE employees2 (
    company text,
    name text,
    age int,
    role text,
    PRIMARY KEY (company, name)
);

Page 17

Primary key

 company | name  | age | role
---------+-------+-----+------
 B       | john  |  25 | DEV
 B       | peter |  22 | DEV
 A       | kate  |  33 | PM
 A       | mario |  38 | SM
 A       | neno  |  32 | DEV

PRIMARY KEY (company, name)

Page 18

Primary key

 company | name  | age | role
---------+-------+-----+------
 B       | john  |  25 | DEV
 B       | peter |  22 | DEV
 A       | kate  |  33 | PM
 A       | mario |  38 | SM
 A       | neno  |  32 | DEV

Page 19

Primary key

SELECT * FROM employees2 WHERE company = 'A' AND name > 'kate';

 company | name  | age | role
---------+-------+-----+------
 A       | mario |  38 | SM
 A       | neno  |  32 | DEV
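The range predicate on name works here because rows inside one partition are kept sorted by the clustering column. A minimal Python sketch of that idea follows. It is a toy model under stated assumptions, not how Cassandra is implemented, and the helper names `insert` and `names_after` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Toy model: partition key (company) -> rows kept sorted by clustering column (name)
employees2 = {}

def insert(company, name, age, role):
    """Place the row at its sorted position inside the company's partition."""
    rows = employees2.setdefault(company, [])
    names = [row[0] for row in rows]
    i = bisect_left(names, name)
    if i < len(rows) and rows[i][0] == name:
        rows[i] = (name, age, role)  # same primary key: the row is overwritten
    else:
        rows.insert(i, (name, age, role))

def names_after(company, name):
    """Model of: WHERE company = ? AND name > ?"""
    rows = employees2.get(company, [])
    names = [row[0] for row in rows]
    return rows[bisect_right(names, name):]

# Insert in arbitrary order; each partition still ends up sorted by name.
for row in [("B", "peter", 22, "DEV"), ("A", "neno", 32, "DEV"),
            ("A", "kate", 33, "PM"), ("B", "john", 25, "DEV"),
            ("A", "mario", 38, "SM")]:
    insert(*row)

print(names_after("A", "kate"))  # [('mario', 38, 'SM'), ('neno', 32, 'DEV')]
```

The query first selects one partition by equality on company and then slices a contiguous run of sorted rows, which is the access pattern the composite primary key is designed for.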

Page 20

Primary key

PRIMARY KEY (a, b)
PRIMARY KEY ((x, y), z, w)

•  Partition Key (row): just the first element, a or the group (x, y)
•  Clustering Columns: the rest, b or z and w
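The rule on this page can be written down as a tiny function. This is only an illustrative sketch: the name `split_primary_key` and the tuple notation (a nested tuple standing for the extra pair of parentheses) are assumptions, not part of CQL or of any driver.

```python
def split_primary_key(primary_key):
    """Split a PRIMARY KEY definition into (partition key columns, clustering columns).

    The definition is a tuple of column names; a nested tuple in the first
    position stands for the extra parentheses of a composite partition key.
    """
    first, *rest = primary_key
    partition = list(first) if isinstance(first, tuple) else [first]
    return partition, list(rest)

print(split_primary_key(("a", "b")))              # (['a'], ['b'])
print(split_primary_key((("x", "y"), "z", "w")))  # (['x', 'y'], ['z', 'w'])
```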

Page 21

One to Many: Users

CREATE TABLE users (
    username text,
    first_name text,
    last_name text,
    email text,
    PRIMARY KEY (username)
);

Page 22

One to Many: Videos

CREATE TABLE videos (
    video_id uuid,
    video_name text,
    username text,
    description text,
    tags text,
    upload_date timestamp,
    PRIMARY KEY (video_id)
);

Page 23

One to Many: User videos

CREATE TABLE username_video_ind (
    username text,
    video_id uuid,
    upload_date timestamp,
    video_name text,
    PRIMARY KEY (username, video_id)
);
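With no joins available, the application writes each video twice: once to the table keyed by video_id and once to the per-user lookup table. The Python sketch below models that with plain dictionaries. It is a toy illustration under stated assumptions: the helper names `upload_video` and `videos_of` and the sample users are hypothetical, and ordering by the integer value of the UUID is a simplification.

```python
import uuid
from datetime import datetime, timezone

videos = {}              # video_id -> full video row
username_video_ind = {}  # username -> {video_id: (upload_date, video_name)}

def upload_video(username, video_name, description=""):
    """Write the same video to both tables (denormalization instead of a join)."""
    video_id = uuid.uuid4()
    upload_date = datetime.now(timezone.utc)
    videos[video_id] = {
        "video_name": video_name,
        "username": username,
        "description": description,
        "upload_date": upload_date,
    }
    username_video_ind.setdefault(username, {})[video_id] = (upload_date, video_name)
    return video_id

def videos_of(username):
    """Read one partition of the lookup table; no scan of the videos table."""
    partition = username_video_ind.get(username, {})
    ordered = sorted(partition.items(), key=lambda item: item[0].int)
    return [(video_id, name) for video_id, (_, name) in ordered]

first = upload_video("ana", "intro")
upload_video("ana", "demo")
upload_video("ivo", "talk")
print(len(videos_of("ana")))  # 2
```

The extra write is the price of answering "which videos did this user upload?" from a single partition, which fits the "bring on the writes" advice earlier in the talk.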

Page 24

Many to Many: Comments

CREATE TABLE comments_by_video (
    video_id uuid,
    username text,
    comment_time timestamp,
    comment text,
    PRIMARY KEY (video_id, username)
);

CREATE TABLE comments_by_user (
    username text,
    videoid uuid,
    comment_time timestamp,
    comment text,
    PRIMARY KEY (username, videoid)
);
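The two tables hold the same comments, each arranged for one question: "comments on this video" and "comments by this user". A toy Python model of the double write follows. The helper names and sample data are hypothetical, and nested dictionaries stand in for partitions and clustering columns.

```python
import uuid
from datetime import datetime, timezone

comments_by_video = {}  # video_id -> {username: (comment_time, comment)}
comments_by_user = {}   # username -> {video_id: (comment_time, comment)}

def add_comment(video_id, username, comment):
    """Write the comment to both tables so each query reads a single partition."""
    comment_time = datetime.now(timezone.utc)
    comments_by_video.setdefault(video_id, {})[username] = (comment_time, comment)
    comments_by_user.setdefault(username, {})[video_id] = (comment_time, comment)

def comments_for_video(video_id):
    partition = comments_by_video.get(video_id, {})
    return {user: text for user, (_, text) in partition.items()}

def comments_from_user(username):
    partition = comments_by_user.get(username, {})
    return {video: text for video, (_, text) in partition.items()}

v1, v2 = uuid.uuid4(), uuid.uuid4()
add_comment(v1, "ana", "nice")
add_comment(v1, "ivo", "+1")
add_comment(v2, "ana", "great")
print(comments_for_video(v1))  # {'ana': 'nice', 'ivo': '+1'}
```

One consequence of these primary keys is visible in the model: a second comment by the same user on the same video replaces the first, because a write with an existing primary key overwrites the row. Adding comment_time to the clustering columns would be one way to keep both; that is a suggestion, not something shown on the slides.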

Page 25

Reverse ordering

CREATE TABLE credit_transaction (
    username text,
    type text,
    date_time timestamp,
    credits int,
    PRIMARY KEY (username, date_time, type)
) WITH CLUSTERING ORDER BY (date_time DESC, type ASC);
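The effect of the clustering order can be imitated in a few lines of Python: within one user's partition, rows are ordered by timestamp descending and then by type ascending, so the newest transactions come first. This is a sketch for illustration only; the function name `clustering_sort` and the sample rows are made up.

```python
from datetime import datetime

def clustering_sort(rows):
    """Order rows as CLUSTERING ORDER BY (date_time DESC, type ASC) would."""
    # Python's sort is stable: sort by the secondary column first,
    # then by the primary column in reverse.
    by_type = sorted(rows, key=lambda r: r["type"])
    return sorted(by_type, key=lambda r: r["date_time"], reverse=True)

# Hypothetical transactions in a single user's partition
partition = [
    {"date_time": datetime(2014, 11, 1, 10, 0), "type": "purchase", "credits": 50},
    {"date_time": datetime(2014, 11, 3, 9, 30), "type": "refund", "credits": 10},
    {"date_time": datetime(2014, 11, 3, 9, 30), "type": "bonus", "credits": 5},
    {"date_time": datetime(2014, 11, 2, 18, 0), "type": "purchase", "credits": 20},
]

ordered = clustering_sort(partition)
latest_two = ordered[:2]  # "most recent first" reads from the start of the partition
print([(r["date_time"].day, r["type"]) for r in ordered])
# [(3, 'bonus'), (3, 'refund'), (2, 'purchase'), (1, 'purchase')]
```

Storing rows in the order they will be read means a "latest N transactions" query is a read from the front of the partition rather than a sort at query time.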

Page 26

Q & A

Thank you!

Don’t be afraid of writes!

Join Zagreb Cassandra Users!

@msvaljek
