Apachecon cassandra transport

Apache Cassandra Clients and Transports Thursday, February 28, 13

description

Apache Cassandra client and transport architecture. Slides from ApacheCon NA 2013 in Portland, OR.

Transcript of Apachecon cassandra transport

Page 1: Apachecon cassandra transport

Apache Cassandra

Clients and Transports

Thursday, February 28, 13

Page 2: Apachecon cassandra transport

Hi Folks! I’m Nate, @zznate

Page 3: Apachecon cassandra transport

API Management
API Analytics
API Tools

Page 4: Apachecon cassandra transport

Clients and transports for Cassandra

Page 5: Apachecon cassandra transport

But first... some questions

Page 6: Apachecon cassandra transport

But first: Architectural stuff

Page 7: Apachecon cassandra transport

Cassandra: “Sparsely Columnar”

Page 8: Apachecon cassandra transport

An RDBMS table

Page 9: Apachecon cassandra transport

An RDBMS table

Page 10: Apachecon cassandra transport

Cassandra Style

Page 11: Apachecon cassandra transport

Cassandra Style

Page 12: Apachecon cassandra transport

Cassandra data modelling

Page 13: Apachecon cassandra transport

Four common patterns

Simple object to simple row
Sparse object to rows
Materialized view
Manual index

Page 14: Apachecon cassandra transport

simple objects to simple row

Page 15: Apachecon cassandra transport

“static” column family

Page 16: Apachecon cassandra transport

Sparse Objects

Page 17: Apachecon cassandra transport

“dynamic” column family
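
To make the contrast between the "static" and "dynamic" styles concrete, here is a minimal sketch in Python. It models a column family as a dictionary of rows, each holding columns that are returned sorted by name. The class `ColumnFamily`, its methods, and all sample data are hypothetical and exist only for illustration; this is not a Cassandra client API.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy in-memory model: row key -> {column name: value}.

    Hypothetical illustration only, not a Cassandra API.
    """

    def __init__(self, name):
        self.name = name
        self.rows = defaultdict(dict)

    def insert(self, row_key, column_name, value):
        # A write only names the columns it sets; absent columns store nothing.
        self.rows[row_key][column_name] = value

    def columns(self, row_key):
        # Return the row's columns sorted by column name.
        return sorted(self.rows[row_key].items())


# "Static" style: every row uses the same small, known set of column names.
users = ColumnFamily("users")
users.insert("zznate", "name", "Nate")
users.insert("zznate", "email", "nate@example.com")

# "Dynamic" style: column names are themselves data (here, timestamps),
# so rows can grow wide and each row may hold a different set of columns.
timeline = ColumnFamily("timeline")
timeline.insert("zznate", "2013-02-28T10:00", "talk starts")
timeline.insert("zznate", "2013-02-28T09:00", "coffee")
timeline.insert("other", "2013-02-27T18:00", "arrived in Portland")
```

In the dynamic case the sort order of column names does the work an `ORDER BY` would do in an RDBMS, which is why the choice of column name matters as much as the choice of row key.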

Page 18: Apachecon cassandra transport

Materialized Views

Page 19: Apachecon cassandra transport

Materialized view
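
As a rough sketch of the materialized view pattern, assuming the view is maintained by the application at write time: each write goes to the base row and to a second row laid out for the query it serves. The names `users`, `users_by_city`, `save_user`, and `users_in` are hypothetical, and plain dictionaries stand in for column families.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two column families.
users = defaultdict(dict)          # row key: user id
users_by_city = defaultdict(dict)  # row key: city, column name: user id


def save_user(user_id, name, city):
    """Write the base row and the view row together, without reading first."""
    users[user_id].update({"name": name, "city": city})
    # The view row duplicates exactly the data its query needs.
    users_by_city[city][user_id] = name


def users_in(city):
    """Serve the query from a single row: no join, no second lookup."""
    return sorted(users_by_city[city].items())


save_user("u1", "Ada", "Portland")
save_user("u2", "Bob", "Portland")
save_user("u3", "Cy", "Austin")
```

The cost is an extra write per view; the benefit is that the read touches one row.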

Page 20: Apachecon cassandra transport

Regardless of the approach used, there are four overall goals

Page 21: Apachecon cassandra transport

1. Denormalize
2. Eliminate seeks
3. Design for read
4. Optimize for blind writes
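
One way to picture "blind writes" and "design for read" together is the sketch below. It assumes a list-like value is stored as one column per item, so adding an item never requires reading the row first. The names `rows`, `append_blind`, and `read_items` are hypothetical, and the integer counter is only a stand-in for a time-ordered column name.

```python
import itertools
from collections import defaultdict

rows = defaultdict(dict)       # hypothetical stand-in for a column family
_sequence = itertools.count()  # stand-in for a time-ordered column name


def append_blind(row_key, item):
    """Blind write: add one new column; nothing is read beforehand."""
    column_name = next(_sequence)
    rows[row_key][column_name] = item


def read_items(row_key):
    """Read one row with columns in name order: no extra lookups needed."""
    return [value for _, value in sorted(rows[row_key].items())]


append_blind("cart:1", "apple")
append_blind("cart:1", "pear")
```

The alternative, reading a serialized list, modifying it, and writing it back, adds a read to every write and is what goal 4 argues against.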

Page 22: Apachecon cassandra transport

Now... let’s talk about protocols

Page 23: Apachecon cassandra transport

Thrift

Page 24: Apachecon cassandra transport

Thrift

RPC-Based

Page 25: Apachecon cassandra transport

Thrift

RPC-Based
Mature Apache Project

Page 26: Apachecon cassandra transport

Thrift

RPC-Based
Mature Apache Project
Supports lots of languages

Page 27: Apachecon cassandra transport

Thrift

RPC-Based
Mature Apache Project
Supports lots of languages
Extensible!

Page 28: Apachecon cassandra transport

CQL

Page 29: Apachecon cassandra transport

CQL

Well defined protocol

Page 30: Apachecon cassandra transport

CQL

Well defined protocol
Supports Compression

Page 31: Apachecon cassandra transport

CQL

Well defined protocol
Supports Compression
Netty/NIO-based

Page 32: Apachecon cassandra transport

Storage Mechanics (but quickly)

Page 33: Apachecon cassandra transport

get_slice

Workhorse of Cassandra selection methods

Page 34: Apachecon cassandra transport

get_slice: key

The row key

Page 35: Apachecon cassandra transport

get_slice: ColumnParent

The column family (a.k.a. table)

Page 36: Apachecon cassandra transport

get_slice: SlicePredicate

defines the column range, or specifically named columns

Page 37: Apachecon cassandra transport

get_slice: ConsistencyLevel

The level of consistency we want for this read
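
The four arguments described on the preceding slides can be sketched as a toy, in-memory `get_slice`. The argument order (key, column parent, predicate, consistency level) follows the Thrift call; everything else is an assumption made for illustration: the real `SlicePredicate` nests its range in a separate `SliceRange` structure, which is flattened here, and `STORE` is a hypothetical stand-in for the storage engine with a single replica.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class ConsistencyLevel(Enum):
    # A subset of the levels, with illustrative values only.
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


@dataclass
class ColumnParent:
    column_family: str


@dataclass
class SlicePredicate:
    # Either explicit column names, or an inclusive [start, finish] name range.
    column_names: Optional[Sequence[str]] = None
    start: str = ""    # empty means "from the first column"
    finish: str = ""   # empty means "to the last column"
    count: int = 100   # maximum number of columns returned for a range


# Toy storage: column family -> row key -> [(name, value)] sorted by name.
STORE = {
    "timeline": {
        "zznate": [("09:00", "coffee"), ("10:00", "talk"), ("11:00", "questions")],
    }
}


def get_slice(key, column_parent, predicate, consistency_level):
    """Return the columns of one row that the predicate selects."""
    # consistency_level is accepted but unused: this toy has one replica.
    row = STORE[column_parent.column_family].get(key, [])
    if predicate.column_names is not None:
        wanted = set(predicate.column_names)
        return [(n, v) for n, v in row if n in wanted]
    names = [n for n, _ in row]
    lo = bisect_left(names, predicate.start) if predicate.start else 0
    hi = bisect_right(names, predicate.finish) if predicate.finish else len(names)
    return row[lo:hi][: predicate.count]


late_morning = get_slice(
    "zznate",
    ColumnParent("timeline"),
    SlicePredicate(start="09:30", finish="11:00"),
    ConsistencyLevel.QUORUM,
)
```

Because columns within a row are kept sorted by name, a range predicate reduces to two binary searches and one contiguous read, which is what makes the call a workhorse.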

Page 38: Apachecon cassandra transport

Obtuse at first glance, but nothing is hidden

Page 39: Apachecon cassandra transport

So...

Page 40: Apachecon cassandra transport

But one person’s abstraction leakage is another’s preferred model

Page 41: Apachecon cassandra transport

How closely do you want to interact with the underlying storage engine?

Page 42: Apachecon cassandra transport

Client APIs

Page 43: Apachecon cassandra transport

Benefits of thrift

Page 44: Apachecon cassandra transport

Benefits of thrift

Mature selection of clients

Page 45: Apachecon cassandra transport

Benefits of thrift

Mature selection of clients
Multiple languages

Page 46: Apachecon cassandra transport

Benefits of thrift

Mature selection of clients
Multiple languages
Well documented

Page 47: Apachecon cassandra transport

Benefits of thrift

Mature selection of clients
Multiple languages
Well documented
Can be used in other places

Page 48: Apachecon cassandra transport

Drawbacks of thrift

Page 49: Apachecon cassandra transport

Drawbacks of thrift

Several objects are required for any request

Page 50: Apachecon cassandra transport

Drawbacks of thrift

Several objects are required for any request
Clients differ in implementation

Page 51: Apachecon cassandra transport

Drawbacks of thrift

Several objects are required for any request
Clients differ in implementation
Upstream dependency issues

Page 52: Apachecon cassandra transport

Drawbacks of thrift

Several objects are required for any request
Clients differ in implementation
Upstream dependency issues
Schema changes and cluster health done pro-actively

Page 53: Apachecon cassandra transport

Benefits of cql api

Page 54: Apachecon cassandra transport

Benefits of cql api

Stored procedures

Page 55: Apachecon cassandra transport

Benefits of cql api

Stored procedures
Common operations are straightforward

Page 56: Apachecon cassandra transport

Benefits of cql api

Stored procedures
Common operations are straightforward
Cluster health and schema change push-back

Page 57: Apachecon cassandra transport

Benefits of cql api

Stored procedures
Common operations are straightforward
Cluster health and schema change push-back
Awesome client available

Page 58: Apachecon cassandra transport

Drawbacks of CQL apis

Page 59: Apachecon cassandra transport

Drawbacks of CQL apis

Still have idiomatic clients

Page 60: Apachecon cassandra transport

Drawbacks of CQL apis

Still have idiomatic clients
Still a binary protocol

Page 61: Apachecon cassandra transport

Drawbacks of CQL apis

Still have idiomatic clients
Still a binary protocol
Default storage model imposes substantial restrictions*

* see gotchas section later

Page 62: Apachecon cassandra transport

Considerations for your app

Page 63: Apachecon cassandra transport

Stick with Thrift if...

Page 64: Apachecon cassandra transport

Heavy update workloads

Page 65: Apachecon cassandra transport

Large, dynamic batch insertions

Page 66: Apachecon cassandra transport

Hadoop integration (CASSANDRA-4421)

Page 67: Apachecon cassandra transport

Commonly deal with very wide rows (CASSANDRA-4176)

Page 68: Apachecon cassandra transport

CASSANDRA-4176: “Pick your shard keys carefully”
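
For readers unfamiliar with what a "shard key" means for a wide row, here is a minimal sketch of one common approach, stated as an assumption rather than anything taken from the ticket: split one very wide row into several rows by appending a bucket (here, a calendar day) to the row key. The functions `bucketed_key` and `keys_for_range` and the `sensor-7` identifier are hypothetical.

```python
from datetime import date, timedelta


def bucketed_key(entity_id, day):
    """Build a row key that starts a new row per entity per day."""
    return f"{entity_id}:{day.isoformat()}"


def keys_for_range(entity_id, first_day, last_day):
    """List every row key a reader must fetch to cover an inclusive date range."""
    days = (last_day - first_day).days
    return [bucketed_key(entity_id, first_day + timedelta(days=i)) for i in range(days + 1)]


keys = keys_for_range("sensor-7", date(2013, 2, 27), date(2013, 3, 1))
```

The trade-off is visible in `keys_for_range`: a smaller bucket keeps each row narrow but makes a range read touch more rows, so the bucket size has to match the queries the application actually runs.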

Page 69: Apachecon cassandra transport

Page 70: Apachecon cassandra transport

Consider CQL if...

Page 71: Apachecon cassandra transport

Static column family model: Take advantage of stored procedures for common reads

Page 72: Apachecon cassandra transport

Despite the shard key jab, CQL makes good use of the storage model

Page 73: Apachecon cassandra transport

You can replace some custom serialization with collections

Page 74: Apachecon cassandra transport

Integration with JDBC and/or BI tools

Page 75: Apachecon cassandra transport

Wire efficient: Does not return timestamp or TTL by default

Page 76: Apachecon cassandra transport

Larger, potentially more transient environments

Page 77: Apachecon cassandra transport

But CQL is:
- new
- an abstraction

Page 78: Apachecon cassandra transport

In some cases, CQL might not do what you think

Page 79: Apachecon cassandra transport

Most common CQL pitfalls

Page 80: Apachecon cassandra transport

Collections can only be retrieved in their entirety

Page 81: Apachecon cassandra transport

Can’t mix static and dynamic data in a column family

Page 82: Apachecon cassandra transport

“keys only” range slices don’t work (CASSANDRA-4536)

Page 83: Apachecon cassandra transport

Range ghosts will not be returned

Page 84: Apachecon cassandra transport

Batch inserts are clunky (CASSANDRA-4693)

Page 85: Apachecon cassandra transport

With non-compact storage the whole row must be read every time.

Page 86: Apachecon cassandra transport

The takeaway is that you have options. Particularly good ones for Java.

Page 87: Apachecon cassandra transport

Page 88: Apachecon cassandra transport

BUT

Page 89: Apachecon cassandra transport

there is a larger, more fundamental problem to discuss

Page 90: Apachecon cassandra transport

“If [they] think that CQL is the answer to usability then I just won. We at least know where our problems are.”
- 10gen exec.

Page 91: Apachecon cassandra transport

The market has spoken and we missed the boat.

Page 92: Apachecon cassandra transport

POST /endpoint {json}
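
The interaction style this slide points at can be sketched with Python's standard library alone. The host, port, and payload below are hypothetical, and `/endpoint` is the slide's own placeholder; the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_post(base_url, path, payload):
    """Build (but do not send) an HTTP POST that carries a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server address and document, for illustration only.
request = build_post("http://localhost:8080", "/endpoint", {"name": "Nate"})
# Sending it would be: urllib.request.urlopen(request)
```

The point of the comparison is how little the client needs: no generated stubs, no driver, only HTTP and a JSON document.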

Page 93: Apachecon cassandra transport

A Cassandra-MVP actually maintains a REST front-end

Page 94: Apachecon cassandra transport

So we’ve taken this and gone further

Page 95: Apachecon cassandra transport

What if...

Page 96: Apachecon cassandra transport

Coming soon...

Intravert.
Vert.x + Cassandra.
ASF-licensed.
Driven by real-world requirements.
