Cassandra 2.0 better, faster, stronger

Patrick McFadin (@PatrickMcFadin), Chief Evangelist/Solution Architect, DataStax
Thursday, October 3, 2013
©2013 DataStax Confidential. Do not distribute without consent.

description

Apache Cassandra 2.0 is out, and now there's no reason not to ditch that ol' legacy relational system for your important online applications. Cassandra 2.0 includes big-impact features like Lightweight Transactions and Triggers. But do you know about the other enhancements that got lost in the noise? Let's put the spotlight on all the things: changes in memory management, file handling and internals. They are low on hype but pack a big punch. While we were at it, we also did a bit of house cleaning.

Transcript of Cassandra 2.0 better, faster, stronger

Page 1: Cassandra 2.0   better, faster, stronger

@PatrickMcFadin

Patrick McFadin, Chief Evangelist/Solution Architect - DataStax

Cassandra 2.0: Better, Stronger, Faster

Page 2: Cassandra 2.0   better, faster, stronger

Five Years of Cassandra

[Timeline figure. Dates: Jul-08, Jul-09, May-10, Feb-11, Dec-11, Oct-12, Jul-13. Releases: 0.1, 0.3, 0.6, 0.7, 1.0, 1.2 ... 2.0; DSE also marked.]

Page 3: Cassandra 2.0   better, faster, stronger

Cassandra 2.0 - Big new features

Page 4: Cassandra 2.0   better, faster, stronger

Lightweight transactions: the problem

Session 1
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (username, password) VALUES ('jbellis', 'xdg44hh')

Session 2
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (username, password) VALUES ('jbellis', '8dhh43k')

It’s a Race!

Who wins?
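The race above can be reproduced with a toy model. This is only a sketch: a plain dict stands in for the users table, and `select_user` and `insert_user` are hypothetical helpers, not Cassandra or driver calls. Both sessions read before either writes, so both see an empty result and both insert.

```python
# Toy model of the check-then-insert race; not Cassandra code.
users = {}  # username -> password


def select_user(username):
    """Return the stored password, or None for an empty result set."""
    return users.get(username)


def insert_user(username, password):
    """Unconditional write: silently replaces whatever is already there."""
    users[username] = password


# Both sessions run their SELECT before either one writes.
session1_sees = select_user("jbellis")
session2_sees = select_user("jbellis")

# Each session saw an empty result set, so each believes the name is free.
if session1_sees is None:
    insert_user("jbellis", "xdg44hh")
if session2_sees is None:
    insert_user("jbellis", "8dhh43k")

# One password has been overwritten and neither session was told.
print(users["jbellis"])
```

In this model the later write simply replaces the earlier one. The point is that neither session learns there was a conflict.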

Page 5: Cassandra 2.0   better, faster, stronger

Why Locking Doesn’t Work

[Diagram, built up over four slides: Client (locks) -> request -> Coordinator -> internal request -> Replica. The internal request is marked X, the coordinator keeps a hint, and the client gets a timeout response.]

• Client locks
• Write times out
• Lock released
• Hint is replayed!!
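The four bullets can be walked through with a small simulation. This is a sketch under stated assumptions: the `Coordinator` class, its `replica_up` flag and the `lock_holder` variable are all invented for illustration and do not mirror Cassandra's internals. It only shows that a write reported as timed out can still land after the lock is gone.

```python
# Toy model: a client-side lock cannot fence a write that timed out,
# because the coordinator may still deliver it later as a hint.
# All names are hypothetical; this is not Cassandra's implementation.

class Coordinator:
    def __init__(self):
        self.replica = {}        # data actually stored on the replica
        self.hints = []          # writes kept for a replica that was unreachable
        self.replica_up = True

    def write(self, key, value):
        """Return True on success, False on timeout (the write becomes a hint)."""
        if self.replica_up:
            self.replica[key] = value
            return True
        self.hints.append((key, value))
        return False

    def replay_hints(self):
        """Deliver the kept writes once the replica is reachable again."""
        for key, value in self.hints:
            self.replica[key] = value
        self.hints.clear()


coordinator = Coordinator()

# 1. Client A takes the lock and writes, but the replica is unreachable.
lock_holder = "A"
coordinator.replica_up = False
a_succeeded = coordinator.write("jbellis", "xdg44hh")

# 2. The write timed out, so A gives up and releases the lock.
lock_holder = None

# 3. Client B takes the lock, sees no row, decides the name is free, unlocks.
coordinator.replica_up = True
lock_holder = "B"
b_sees = coordinator.replica.get("jbellis")
lock_holder = None

# 4. The hint is replayed: A's "failed" write lands while nobody holds the lock.
coordinator.replay_hints()
after_replay = coordinator.replica.get("jbellis")
```

The lock was held correctly at every step, yet the replica still changed outside any critical section.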

Page 9: Cassandra 2.0   better, faster, stronger

Paxos
• Consensus algorithm
• All operations are quorum-based
• Each replica sends information about unfinished operations to the leader during prepare
• Paxos Made Simple
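For readers who have not seen the algorithm, here is a minimal sketch of textbook single-decree Paxos, assuming in-process calls instead of a network. It is not Cassandra's implementation; `Acceptor` and `propose` are names chosen for this example. It illustrates the two bullets above: every phase needs a quorum, and during prepare each replica reports any unfinished proposal, which the proposer must then complete instead of its own value.

```python
# Textbook single-decree Paxos, reduced to in-process calls.
# Illustrative sketch only; not how Cassandra structures its code.

class Acceptor:
    def __init__(self):
        self.promised = 0      # highest ballot promised so far
        self.accepted = None   # (ballot, value) last accepted, if any

    def prepare(self, ballot):
        """Promise to ignore older ballots and report any unfinished proposal."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        """Accept the value unless a newer ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def propose(acceptors, ballot, value):
    """Run one round; return the value that was accepted, or None on failure."""
    quorum = len(acceptors) // 2 + 1

    # Phase 1: prepare. Collect promises from a quorum.
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < quorum:
        return None

    # If any replica reported an unfinished proposal, finish that one instead.
    unfinished = [p for p in promises if p is not None]
    if unfinished:
        value = max(unfinished, key=lambda p: p[0])[1]

    # Phase 2: accept. The value stands once a quorum accepts it.
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None


replicas = [Acceptor() for _ in range(3)]
first = propose(replicas, ballot=1, value="xdg44hh")
second = propose(replicas, ballot=2, value="8dhh43k")
stale = propose(replicas, ballot=1, value="late")
```

Here `first` is the proposed value, `second` comes back as the first value because the newer ballot has to adopt what was already accepted, and `stale` is None because its ballot is too old to gather promises.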

Page 10: Cassandra 2.0   better, faster, stronger

LWT: details
• 4 round trips vs 1 for normal updates
• Paxos state is durable
• Immediate consistency with no leader election or failover
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
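A back-of-the-envelope estimate of what "4 round trips vs 1" means for latency. The 0.5 ms round-trip time is an assumed figure for illustration, not a measurement, and the model ignores disk and processing time.

```python
# Rough latency model: cost is dominated by sequential network round trips.
# RTT_MS is an assumed value for illustration only.

def estimated_latency_ms(round_trips, rtt_ms):
    """Sequential round trips multiplied by the per-trip time."""
    return round_trips * rtt_ms


RTT_MS = 0.5
normal_update = estimated_latency_ms(1, RTT_MS)
lightweight_txn = estimated_latency_ms(4, RTT_MS)
slowdown = lightweight_txn / normal_update
```

Under this simple model a lightweight transaction costs about four times a normal update, which is a reason to reserve it for the operations that need it.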

Page 12: Cassandra 2.0   better, faster, stronger

Using LWT

• Don’t overwrite an existing record

INSERT INTO USERS (username, email, ...) VALUES ('jbellis', '[email protected]', ... ) IF NOT EXISTS;

• Only update record if condition is met

UPDATE USERS SET email = '[email protected]', ... WHERE username = 'jbellis' IF email = '[email protected]';
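The semantics of the two conditional statements can be modelled as compare-and-set operations that report whether they were applied. This is a sketch, assuming an in-memory table where a `threading.Lock` stands in for the Paxos round; `UsersTable`, `insert_if_not_exists` and `update_if` are hypothetical names, not driver APIs. In Cassandra the outcome comes back in an `[applied]` column of the result.

```python
# Toy compare-and-set table; the lock stands in for the Paxos round.
# Class and method names are hypothetical, not part of any driver.
import threading


class UsersTable:
    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, username, row):
        """Model of INSERT ... IF NOT EXISTS; returns the applied flag."""
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = dict(row)
            return True

    def update_if(self, username, column, expected, new_value):
        """Model of UPDATE ... IF column = expected; returns the applied flag."""
        with self._lock:
            row = self._rows.get(username)
            if row is None or row.get(column) != expected:
                return False
            row[column] = new_value
            return True


table = UsersTable()
results = []


def register(password):
    results.append(table.insert_if_not_exists("jbellis", {"password": password}))


# Two sessions race to claim the same username; exactly one is applied.
threads = [threading.Thread(target=register, args=(p,)) for p in ("xdg44hh", "8dhh43k")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A conditional update with a stale expectation is rejected.
rejected = table.update_if("jbellis", "password", "not-the-current-value", "new-secret")
```

Unlike the unconditional insert from the race slide, the losing session gets an explicit False and can pick another username.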

Page 13: Cassandra 2.0   better, faster, stronger

Triggers

CREATE TRIGGER <name> ON <table> USING <classname>;

DROP TRIGGER <name> ON [<keyspace>.]<table>;

• Executed on the coordinator before mutation
• Takes original mutation and adds any new mutations
• Jars deployed per server

Page 14: Cassandra 2.0   better, faster, stronger

Trigger implementation

class MyTrigger implements ITrigger
{
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        ...
    }
}

• You have to implement your own ITrigger (for now)
• Compile and deploy to each server
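The augment contract, taking the original mutation and returning extra ones, can be modelled outside the JVM. This is a language-neutral sketch in Python, not the ITrigger API: `audit_trigger`, `coordinator_write` and the `audit_log` table are invented for illustration, and mutations are plain tuples of (table, key, columns).

```python
# Toy model of the trigger contract: triggers see the original mutation
# and may only add further mutations. All names are hypothetical.

def audit_trigger(key, update):
    """Return extra mutations to apply alongside the original one."""
    return [("audit_log", key, {"changed_columns": sorted(update)})]


def coordinator_write(tables, table_name, key, update, triggers):
    """Run triggers before the mutation is applied, then apply everything."""
    mutations = [(table_name, key, update)]
    for trigger in triggers:
        mutations.extend(trigger(key, update))
    for target, row_key, columns in mutations:
        tables.setdefault(target, {}).setdefault(row_key, {}).update(columns)
    return mutations


tables = {}
applied = coordinator_write(
    tables, "users", "jbellis", {"password": "xdg44hh"}, [audit_trigger]
)
```

After the call, `tables` holds the original row in `users` and one extra row in `audit_log`, both produced by the single client write.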

Page 15: Cassandra 2.0   better, faster, stronger

Experimental!
• Relies on internal RowMutation, ColumnFamily classes
• Not sandboxed. Be careful!
• Expect changes in 2.1

Page 16: Cassandra 2.0   better, faster, stronger

CQL Improvements
• ALTER DROP
  • Remove a field from a CQL table.

ALTER TABLE users DROP address3;

• Conditional schema changes
  • Only execute if condition met

CREATE KEYSPACE IF NOT EXISTS ks WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 3 };

CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);

DROP KEYSPACE IF EXISTS ks;

Page 17: Cassandra 2.0   better, faster, stronger

CQL Improvements
• Aliases in SELECT

SELECT event_id, dateOf(created_at) AS creation_date, blobAsText(content) AS content FROM timeline;

 event_id                | creation_date            | content
-------------------------+--------------------------+----------------------
 550e8400-e29b-41d4-a716 | 2013-07-26 10:44:33+0200 | Something happened!?

• Limit and TTL in prepared statements

SELECT * FROM myTable LIMIT ?;

UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';

Page 18: Cassandra 2.0   better, faster, stronger

Cassandra 2.0 - Minor features

Page 19: Cassandra 2.0   better, faster, stronger

Query performance
• Hint when reading time series data
  • Time series slices find data faster
• Hybrid approach to Leveled Compaction under stress
  • Use size tiered until we catch up
  • Reduce read latency impact
• Off-heap memory speedup
  • Bytes moved on and off 10x faster
• Removal of row-level bloom filters

Page 20: Cassandra 2.0   better, faster, stronger

Server performance
• Single pass compaction
  • No more incremental compaction for large storage rows
• LMAX Disruptor on Thrift interface
  • Crazy fast and efficient concurrent threads. Faster HSHA
• Support for pluggable off-heap memory allocators
  • JEMalloc support to start. Faster memory access.
• Bigger Level 0 file size
  • 5M was just too small. Now 160M

Page 21: Cassandra 2.0   better, faster, stronger

Removed features
• SuperColumns are gone!
  • Not the API, just the underlying implementation
• On-heap row cache
  • Row cache is no longer an option in the JVM
• Memory pressure relief valves - Gone from yaml
  • flush_largest_memtables_at
  • reduce_cache_sizes_at
  • reduce_cache_sizes_to

Page 22: Cassandra 2.0   better, faster, stronger

Operation Changes
• JDK 7 now required
• Vnodes are default
• Streaming overhaul
  • Control. Streams are grouped and broken into plans
  • Traceability. Each stream has an ID. Monitor each stream.
  • Performance. Streams are now pipelined. No waiting for ACK

Page 23: Cassandra 2.0   better, faster, stronger

Thank you!

Apache Cassandra 2.0 - Data model on fire

Next talk in my data model series!
