Javantura v2 - Data modeling with Apache Cassandra - Marko Švaljek

Post on 05-Jul-2015

description

Cassandra is a NoSQL, column-family-oriented database with built-in scalability. More and more companies face scaling challenges every day, and Cassandra is often the weapon of choice. Being NoSQL does not mean that data modeling doesn't matter; in practice it is crucial to any successful deployment. Lately more and more developers have been asking me to say more about data modeling, especially compared with the relational perspective. That’s what this talk is all about.

Transcript of Javantura v2 - Data modeling with Apache Cassandra - Marko Švaljek

Data modeling with Apache Cassandra

Marko Švaljek @msvaljek

About me
• Software Developer at Kapsch
• Zagreb Cassandra Meetup

About Cassandra
• Written in Java
• Wide Row Store
• Masterless
• DataStax

About Cassandra
• No single point of failure ✓
• Cross datacenter ✓
• Linear scaling ✓
• Data modeling ?

Data Model
• Highly denormalized
• No foreign keys
• No referential integrity
• No joins

Compared to Relational
• Think about the questions up front
• No penalty for wide rows
• Disk is cheap
• Bring on the writes

Cassandra Objects
• Keyspace (database)
• Table
• Primary key
• Index
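The objects listed above map directly onto CQL statements. A minimal sketch, assuming a local single-node test cluster; the keyspace name `javantura_demo` and the replication settings are illustrative and not from the talk:

```cql
-- Hypothetical keyspace for trying out the examples. SimpleStrategy with
-- replication_factor 1 is only suitable for a local, single-node setup.
CREATE KEYSPACE IF NOT EXISTS javantura_demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Tables, primary keys and indexes are then created inside the keyspace.
USE javantura_demo;
```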

Data Organization

Wide Row

CQL
• Cassandra Query Language
• Similar to SQL
• Learning curve
• No code reading

Create table
CREATE TABLE employees (
    name text PRIMARY KEY,
    age int,
    role text
);

Inserting
INSERT INTO employees (name, age, role) VALUES ('mario', 38, 'SM');
INSERT INTO employees (name, age, role) VALUES ('tom', 24, 'DEV');

Selecting
SELECT * FROM employees;

 name  | age | role
-------+-----+------
 tom   |  24 | DEV
 mario |  38 | SM

Hmm …
SELECT * FROM employees WHERE age > 20;

Bad Request: No indexed columns present in by-columns clause with Equal operator
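The query is rejected because `age` is neither part of the primary key nor indexed, so Cassandra has no efficient way to find the matching rows. As a sketch of what the `employees` table can serve instead: a lookup by primary key, and an equality lookup through a secondary index. The index name `employees_role_idx` is made up for illustration.

```cql
-- Lookup by the primary key: always supported.
SELECT * FROM employees WHERE name = 'mario';

-- Hypothetical index name. A secondary index allows equality lookups
-- on a column that is not part of the primary key.
CREATE INDEX employees_role_idx ON employees (role);
SELECT * FROM employees WHERE role = 'DEV';
```

Note that the error message asks for an indexed column used with an equality operator, so an index alone would not turn the range predicate `age > 20` into a valid query in the version shown here.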

How it’s stored
• Every row is a partition
• There is no “next” row
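One way to see why there is no "next" row: partitions are placed by a hash (token) of the partition key, not by the key's natural order. A small sketch using the built-in `token()` function on the `employees` table from the earlier slides:

```cql
-- token() returns the partitioner's hash of the partition key.
-- A full scan returns rows in token order, not alphabetically by name.
SELECT name, token(name) FROM employees;
```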

Primary key
CREATE TABLE employees2 (
    company text,
    name text,
    age int,
    role text,
    PRIMARY KEY (company, name)
);

Primary key
PRIMARY KEY (company, name)

 company | name  | age | role
---------+-------+-----+------
 B       | john  |  25 | DEV
 B       | peter |  22 | DEV
 A       | kate  |  33 | PM
 A       | mario |  38 | SM
 A       | neno  |  32 | DEV

Primary key
SELECT * FROM employees2 WHERE company = 'A' AND name > 'kate';

 company | name  | age | role
---------+-------+-----+------
 A       | mario |  38 | SM
 A       | neno  |  32 | DEV
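To try the range query out, the rows from the slide can be inserted and then read back as slices of a single partition. This is a sketch; the second and third `SELECT` statements are additional illustrations and not part of the talk.

```cql
-- Sample rows taken from the table shown on the slide.
INSERT INTO employees2 (company, name, age, role) VALUES ('A', 'kate', 33, 'PM');
INSERT INTO employees2 (company, name, age, role) VALUES ('A', 'mario', 38, 'SM');
INSERT INTO employees2 (company, name, age, role) VALUES ('A', 'neno', 32, 'DEV');
INSERT INTO employees2 (company, name, age, role) VALUES ('B', 'john', 25, 'DEV');
INSERT INTO employees2 (company, name, age, role) VALUES ('B', 'peter', 22, 'DEV');

-- Range on the clustering column, inside one partition.
SELECT * FROM employees2 WHERE company = 'A' AND name > 'kate';

-- Slice bounded on both sides of the clustering column.
SELECT * FROM employees2 WHERE company = 'A' AND name >= 'kate' AND name < 'n';

-- The same partition read in reverse clustering order.
SELECT * FROM employees2 WHERE company = 'A' ORDER BY name DESC;
```

The range works here because the partition key `company` is fixed with an equality, and rows inside a partition are stored sorted by the clustering column `name`.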

Primary key
PRIMARY KEY (a, b)
PRIMARY KEY ((x, y), z, w)
• Partition key (row): just the first element, a or (x, y)
• Clustering columns: the rest, b or z and w
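The composite form `PRIMARY KEY ((x, y), z, w)` can be illustrated with a concrete table. The table `sensor_readings` and all of its columns are hypothetical, chosen only to show the shape of the key:

```cql
-- Hypothetical table: (sensor_id, reading_day) together form the partition key,
-- reading_time and metric are the clustering columns.
CREATE TABLE sensor_readings (
    sensor_id text,
    reading_day text,
    reading_time timestamp,
    metric text,
    reading double,
    PRIMARY KEY ((sensor_id, reading_day), reading_time, metric)
);

-- Both partition key columns must be given to locate the partition;
-- the first clustering column can then be restricted with a range.
SELECT * FROM sensor_readings
 WHERE sensor_id = 's-1' AND reading_day = '2015-07-05'
   AND reading_time >= '2015-07-05 08:00:00+0000';
```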

One to Many: Users

CREATE TABLE users (
    username text,
    first_name text,
    last_name text,
    email text,
    PRIMARY KEY (username)
);

One to Many: Videos

CREATE TABLE videos (
    video_id uuid,
    video_name text,
    username text,
    description text,
    tags text,
    upload_date timestamp,
    PRIMARY KEY (video_id)
);

One to Many: User videos

CREATE TABLE username_video_ind (
    username text,
    video_id uuid,
    upload_date timestamp,
    video_name text,
    PRIMARY KEY (username, video_id)
);
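With this design the application writes each upload twice, once per table, and reads a user's videos from a single partition. A sketch of that pattern, assuming a logged batch is used to keep the two writes together; all sample values, including the `video_id` literal, are made up:

```cql
-- Hypothetical sample values; the UUID and the dates are invented for illustration.
BEGIN BATCH
    INSERT INTO videos (video_id, video_name, username, description, tags, upload_date)
    VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0, 'Data modeling talk', 'msvaljek',
            'Conference session', 'cassandra', '2015-07-05 10:00:00+0000');
    INSERT INTO username_video_ind (username, video_id, upload_date, video_name)
    VALUES ('msvaljek', 99051fe9-6a9c-46c2-b949-38ef78858dd0,
            '2015-07-05 10:00:00+0000', 'Data modeling talk');
APPLY BATCH;

-- "All videos of a user" is a single-partition read, no join needed.
SELECT video_id, video_name, upload_date
  FROM username_video_ind
 WHERE username = 'msvaljek';
```

The duplicated `video_name` and `upload_date` columns are the denormalization at work: they are stored again so the listing query never has to touch the `videos` table.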

Many to Many: Comments

CREATE TABLE comments_by_video (
    video_id uuid,
    username text,
    comment_time timestamp,
    comment text,
    PRIMARY KEY (video_id, username)
);

CREATE TABLE comments_by_user (
    username text,
    videoid uuid,
    comment_time timestamp,
    comment text,
    PRIMARY KEY (username, videoid)
);
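Each of the two tables answers one side of the many-to-many relationship. A sketch with made-up sample values, writing one comment to both tables and reading it back from either direction:

```cql
-- Hypothetical sample values; the UUID is invented for illustration.
INSERT INTO comments_by_video (video_id, username, comment_time, comment)
VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0, 'kate',
        '2015-07-05 11:00:00+0000', 'Great talk!');
INSERT INTO comments_by_user (username, videoid, comment_time, comment)
VALUES ('kate', 99051fe9-6a9c-46c2-b949-38ef78858dd0,
        '2015-07-05 11:00:00+0000', 'Great talk!');

-- All comments on a video.
SELECT username, comment_time, comment
  FROM comments_by_video
 WHERE video_id = 99051fe9-6a9c-46c2-b949-38ef78858dd0;

-- All comments written by a user.
SELECT videoid, comment_time, comment
  FROM comments_by_user
 WHERE username = 'kate';
```

Because an `INSERT` with an existing primary key overwrites the row, these keys keep only one comment per user per video. Keeping several would require an additional clustering column, for example `comment_time`.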

Reverse ordering
CREATE TABLE credit_transaction (
    username text,
    type text,
    date_time timestamp,
    credits int,
    PRIMARY KEY (username, date_time, type)
) WITH CLUSTERING ORDER BY (date_time DESC, type ASC);
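The point of the descending clustering order is that the most recent rows sit at the start of the partition, so a "latest N" query needs no sorting at read time. A sketch with made-up sample values, assuming the key columns refer to the declared `date_time` column:

```cql
-- Hypothetical sample transactions.
INSERT INTO credit_transaction (username, type, date_time, credits)
VALUES ('mario', 'purchase', '2015-07-05 09:00:00+0000', 100);
INSERT INTO credit_transaction (username, type, date_time, credits)
VALUES ('mario', 'spend', '2015-07-05 12:30:00+0000', -20);

-- Rows come back newest first because of the clustering order,
-- so LIMIT yields the latest transactions.
SELECT * FROM credit_transaction WHERE username = 'mario' LIMIT 10;
```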

Q & A

Thank you!

Don’t be afraid of writes!

Join Zagreb Cassandra Users!

@msvaljek

msvaljek@gmail.com