Introduction to Apache Cassandra for Java Developers (JavaOne)

Apache Cassandra

An Introduction for Java Developers

Nate (@zznate)

What is Apache Cassandra?

CAP Theorem

Consistency, Availability, Partition Tolerance

Thou shalt have but 2

- Conjecture made by Eric Brewer in 2000
- Published as formal proof in 2002
- See: http://en.wikipedia.org/wiki/CAP_theorem for more

Apache Cassandra Concepts

- Explicit choice of partition tolerance and availability. Consistency is tunable.
- No read before write
- Merge on read
- Idempotent
- Schema optional
- All nodes share the same role
- Still performs well with larger-than-memory data sets

Generally complements one or more other systems

(Not intended to be one-size-fits-all)

*** You should always use the right tool for the right job anyway

How does this differ from an RDBMS?

Substantially.

vs. RDBMS - No Joins

Unless:
- you do them on the client
- you do them via Map/Reduce
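A client-side join can be sketched as two lookups stitched together in application code. This is only an illustration: plain Java maps stand in for two column families, and the class and method names (ClientSideJoin, namesFollowedBy) are hypothetical, not part of any Cassandra client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: plain maps stand in for two column families.
class ClientSideJoin {
    // "Join" in application code: first fetch who the user follows,
    // then look up the name of each followed user.
    static List<String> namesFollowedBy(String user,
                                        Map<String, List<String>> following,
                                        Map<String, String> names) {
        List<String> result = new ArrayList<>();
        for (String followed : following.getOrDefault(user, List.of())) {
            String name = names.get(followed);
            if (name != null) {
                result.add(name); // skip followed users with no name on record
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> following = Map.of("zznate", List.of("driftx", "thobbs"));
        Map<String, String> names = Map.of("driftx", "Brandon", "thobbs", "Tyler");
        System.out.println(namesFollowedBy("zznate", following, names));
    }
}
```

The cost is one extra round of reads per join, which is why the prematerialized approach later in the deck is usually preferred for hot queries.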

vs. RDBMS - Schema Optional

(Though you can add meta information for validation and type checking)

*** Supports secondary indexes too: WHERE state = 'TX'

vs. RDBMS - Prematerialized and Transaction-less

- No ACID transactions
- Limited support for ad-hoc queries

*** You are going to give up both of these anyway when you shard an RDBMS ***

vs. RDBMS - Facilitates Consolidation

It can be your caching layer
* Off-heap cache (provided you install JNA)

It can be your analytics infrastructure
* true map/reduce
* pig driver
* hive driver coming soon

vs. RDBMS - Shared-Nothing Architecture

Every node plays the same role: no masters, no slaves, no special nodes

*** No single point of failure

vs. RDBMS - Real Linear Scalability

Want 2x performance? Add 2x nodes.

*** 'No downtime' included!

vs. RDBMS - Performance

Reads on par with writes

Clustering

Single node cluster (easy development setup)
- one node owns the whole hash range

Clustering

Two node cluster
- Key range divided between nodes

Clustering

Consistent Hashing: md5(zznate) = C

Clustering

Consistent Hashing FTW:
- Ring ownership continuously gossiped between nodes
- Any node can act as a coordinator to service client requests for any key
  * requests forwarded to the appropriate nodes by coordinator transparently to the client
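The idea of hashing a key onto a ring of node tokens can be sketched in a few lines of plain Java. This is a toy model, not Cassandra's implementation: the TokenRing class, its method names, and the evenly spaced tokens are assumptions made for illustration, and gossip, replication, and request forwarding are left out.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Toy token ring (hypothetical names): each node owns the range ending at its token.
class TokenRing {
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    // Hash a key to a non-negative 128-bit token using MD5.
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    void addNode(String name, BigInteger nodeToken) {
        ring.put(nodeToken, name);
    }

    // The owner is the first node whose token is >= the key's token,
    // wrapping around to the lowest token when none is found.
    String ownerOf(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<BigInteger, String> entry = ring.ceilingEntry(token(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        BigInteger rangeSize = BigInteger.ONE.shiftLeft(128); // MD5 tokens fall in [0, 2^128)
        TokenRing ring = new TokenRing();
        String[] nodes = {"A", "B", "C", "D"};
        for (int i = 0; i < nodes.length; i++) {
            // Evenly spaced tokens: 0, 1/4, 2/4 and 3/4 of the range.
            BigInteger nodeToken = rangeSize.multiply(BigInteger.valueOf(i))
                    .divide(BigInteger.valueOf(nodes.length));
            ring.addNode(nodes[i], nodeToken);
        }
        System.out.println("zznate -> " + ring.ownerOf("zznate"));
    }
}
```

Because the lookup depends only on the key's hash and the known tokens, any node holding the ring map can compute the owner, which is what lets every node act as a coordinator.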

Clustering

Client Read: get(zznate), md5 = C

Clustering Scale Out

Clustering - Multi-DC

Clustering - Reliability

Clustering - Multi-Datacenter

Clustering Multi-DC Reliability

Storage (Briefly)

Understanding the on-disk format is extremely helpful in designing your data model correctly

Storage - SSTable

- SSTables are immutable (Merge on read)
- Newest timestamp wins

Storage Compaction

- Merges SSTables, keeping their count down and making merge on read more efficient
- Discards tombstones (more on this later!)
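Merge on read and compaction can be sketched with immutable maps standing in for SSTables. This is a simplified model under stated assumptions: the MergeOnRead and Cell names are hypothetical, each "SSTable" holds the columns of a single row, and tombstone handling is omitted.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: each immutable map plays the role of one SSTable.
class MergeOnRead {
    // A cell is a value plus the timestamp of the write that produced it.
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Merge all tables column by column; the cell with the newest timestamp wins.
    static Map<String, Cell> merge(List<Map<String, Cell>> sstables) {
        Map<String, Cell> merged = new TreeMap<>();
        for (Map<String, Cell> table : sstables) {
            for (Map.Entry<String, Cell> e : table.entrySet()) {
                merged.merge(e.getKey(), e.getValue(),
                        (current, candidate) ->
                                candidate.timestamp > current.timestamp ? candidate : current);
            }
        }
        return merged;
    }

    // A read consults every table and returns the winning value, or null if absent.
    static String read(List<Map<String, Cell>> sstables, String column) {
        Cell cell = merge(sstables).get(column);
        return cell == null ? null : cell.value;
    }

    // Compaction replaces many tables with one merged table, so later reads touch less data.
    static List<Map<String, Cell>> compact(List<Map<String, Cell>> sstables) {
        return List.of(merge(sstables));
    }

    public static void main(String[] args) {
        Map<String, Cell> older = Map.of("email", new Cell("old@example.com", 1L));
        Map<String, Cell> newer = Map.of("email", new Cell("new@example.com", 2L));
        List<Map<String, Cell>> sstables = List.of(older, newer);
        System.out.println(read(sstables, "email"));
        System.out.println(compact(sstables).size());
    }
}
```

Note that the result does not depend on the order in which the tables are visited, only on the timestamps.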

Data Model

"...sparse, persistent, distributed, multi-dimensional sorted map."

(The Bigtable paper)

Data Model

Keyspace
- Collection of Column Families
- Controls replication

Column Family
- Similar to a table
- Columns ordered by name
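One way to picture a column family is as a map of row keys to sorted maps of columns. The sketch below is a mental model only, using java.util collections; the ColumnFamilyModel class and its methods are hypothetical and say nothing about how Cassandra stores data internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory picture of one column family.
class ColumnFamilyModel {
    // row key -> (column name -> value); the inner TreeMap keeps columns ordered by name.
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // All columns of a row, in column-name order (empty if the row is unknown).
    SortedMap<String, String> row(String rowKey) {
        SortedMap<String, String> row = rows.get(rowKey);
        return row == null ? new TreeMap<>() : row;
    }

    // A contiguous slice of columns: from (inclusive) up to to (exclusive).
    SortedMap<String, String> slice(String rowKey, String from, String to) {
        return row(rowKey).subMap(from, to);
    }

    public static void main(String[] args) {
        ColumnFamilyModel users = new ColumnFamilyModel();
        users.put("jbellis", "site", "datastax.com");
        users.put("jbellis", "password", "*");
        users.put("jbellis", "name", "Jonathan");
        // Columns come back sorted by name regardless of insertion order.
        System.out.println(users.row("jbellis").keySet());
    }
}
```

Because columns are kept in name order, reading a contiguous range of column names is cheap, which is what the dynamic column family designs later in the deck rely on.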

Data Model Column Family

Static Column Family
- Model my object data

Dynamic Column Family
- Pre-calculated query results

Nothing stopping you from mixing them!

Data Model Static CF

Users

zznate:  password: *, name: Nate
driftx:  password: *, name: Brandon
thobbs:  password: *, name: Tyler
jbellis: password: *, name: Jonathan, site: datastax.com

Data Model Prematerialized Query

[Diagram: the Following column family, with row keys zznate, driftx, thobbs, jbellis and user names (driftx, thobbs, mdennis, zznate, pcmanus, xedin) as column names]

Data Model Prematerialized Query

Additional examples:
- Timeline of tweets by a user
- Timeline of tweets by all of the people a user is following
- List of comments sorted by score
- List of friends grouped by state
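The "timeline of tweets by all of the people a user is following" case can be sketched as a write-time fan-out: the query result is built when the tweet is written, so the read is a single row lookup. Everything here is an assumption for illustration (the TimelineFanout class, its methods, and the use of a bare timestamp as column name); two tweets with the same timestamp would overwrite each other in this simplified version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of a prematerialized timeline.
class TimelineFanout {
    // author -> users following that author
    private final Map<String, Set<String>> followers = new HashMap<>();
    // user -> (timestamp -> tweet); one wide, sorted row per user
    private final Map<String, NavigableMap<Long, String>> timelines = new HashMap<>();

    void follow(String follower, String author) {
        followers.computeIfAbsent(author, k -> new HashSet<>()).add(follower);
    }

    // Write path: copy the tweet into the timeline row of every follower.
    void tweet(String author, long timestamp, String text) {
        for (String follower : followers.getOrDefault(author, Collections.emptySet())) {
            timelines.computeIfAbsent(follower, k -> new TreeMap<>())
                    .put(timestamp, author + ": " + text);
        }
    }

    // Read path: one row, already sorted; return the newest entries first.
    List<String> timeline(String user, int limit) {
        List<String> newestFirst = new ArrayList<>();
        NavigableMap<Long, String> row = timelines.get(user);
        if (row == null) {
            return newestFirst;
        }
        for (String entry : row.descendingMap().values()) {
            if (newestFirst.size() == limit) {
                break;
            }
            newestFirst.add(entry);
        }
        return newestFirst;
    }

    public static void main(String[] args) {
        TimelineFanout app = new TimelineFanout();
        app.follow("zznate", "driftx");
        app.follow("zznate", "thobbs");
        app.tweet("driftx", 1L, "hello");
        app.tweet("thobbs", 2L, "world");
        System.out.println(app.timeline("zznate", 10));
    }
}
```

The trade is more writes for cheaper reads, which suits a store where writes are inexpensive.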

API Operations

Five general categories

- Retrieving
- Writing/Updating/Removing (all the same op!)
- Increment counters
- Meta Information
- Schema Manipulation
- CQL Execution

Using a Client

Hector Client: http://hector-client.org
- Most popular Java client
- In use at very large installations
- A number of tools and utilities built on top
- Very active community
- MIT Licensed

*** Like any open source project fully dependent on another open source project, it has its warts

Sample Project for Experimenting

https://github.com/zznate/cassandra-tutorial
https://github.com/zznate/hector-examples
- Built using Hector
- Really basic: designed to be beginner level with very few moving parts
- Modify/abuse/alter as needed

*** Descriptions of what is going on and how to run each example are in the Javadoc comments.

ColumnFamilyTemplate

Familiar, type-safe approach
- based on template-method design pattern
- generic: ColumnFamilyTemplate<K, N> (K is the key type, N the column name type)

ColumnFamilyTemplate template = new ThriftColumnFamilyTemplate(keyspaceName, columnFamilyName, StringSerializer.get(), StringSerializer.get());

*** (no generics for clarity)

ColumnFamilyTemplate

new ThriftColumnFamilyTemplate(keyspaceName, columnFamilyName, StringSerializer.get(), StringSerializer.get());

Key Format

Column Name Format
- Cassandra calls this a comparator
- Remember: defines column order in on-disk format

ColumnFamilyTemplate

ColumnFamilyResult res = template.queryColumns("zznate");

String value = res.getString("email");

Date startDate = res.getDate("startDate");

Key Format

Column Name Format

ColumnFamilyTemplate

ColumnFamilyResult wrapper = template.queryColumns("zznate", "patricioe", "thobbs");

String nateEmail = wrapper.getString("email");

wrapper.next();

String patoEmail = wrapper.getString("email");

wrapper.next();

String tylerEmail = wrapper.getString("email");

Querying multiple rows and iterating over results

ColumnFamilyTemplate

ColumnFamilyUpdater updater = template.createUpdater("zznate");

updater.setString("companyName", "DataStax");
updater.addKey("sergek");
updater.setString("companyName", "PrestoSports");

template.update(updater);

Inserting data with ColumnFamilyUpdater

ColumnFamilyTemplate

template.deleteColumn("zznate", "notNeededStuff");
template.deleteColumn("zznate", "somethingElse");
template.deleteColumn("patricioe", "aDifferentColumnName");
...
template.deleteRow(someuser);

template.executeBatch();

Deleting Data with ColumnFamilyTemplate

Deletion

Again: Every mutation is an insert!

- Merge on read

- SSTables are immutable

- Highest timestamp wins
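The three points above imply that a delete is just another write: a marker (tombstone) with a timestamp. A minimal sketch of that idea follows; the TombstoneStore class and its methods are hypothetical, a null value is used as the tombstone marker, and on equal timestamps the existing cell is kept, which is a simplification rather than Cassandra's actual tie-breaking rule.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: inserts and deletes share one write path.
class TombstoneStore {
    static final class Cell {
        final String value; // null marks a tombstone
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    private final Map<String, Cell> cells = new HashMap<>();

    // Every mutation is an insert; the cell with the highest timestamp wins.
    private void apply(String column, Cell incoming) {
        cells.merge(column, incoming,
                (existing, candidate) ->
                        candidate.timestamp > existing.timestamp ? candidate : existing);
    }

    void insert(String column, String value, long timestamp) {
        apply(column, new Cell(Objects.requireNonNull(value), timestamp));
    }

    // Deleting writes a tombstone, even if the column never existed.
    void delete(String column, long timestamp) {
        apply(column, new Cell(null, timestamp));
    }

    // Reads hide tombstones from the caller.
    String read(String column) {
        Cell cell = cells.get(column);
        return (cell == null || cell.isTombstone()) ? null : cell.value;
    }

    // Number of cells physically stored, tombstones included.
    int storedCells() {
        return cells.size();
    }

    public static void main(String[] args) {
        TombstoneStore store = new TombstoneStore();
        store.delete("notNeededStuff", 10L);   // nothing to delete, yet a cell is stored
        store.insert("notNeededStuff", "late", 5L); // older write loses to the tombstone
        System.out.println(store.read("notNeededStuff") + " / stored=" + store.storedCells());
    }
}
```

In this model an older write that arrives after the delete stays hidden, because the tombstone carries the higher timestamp.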

Deletion As Seen by CLI

[default@Tutorial] list StateCity;
Using default limit of 100

-------------------

RowKey: CA Burlingame

=> (column=650, value=33372e3537783132322e3334, timestamp=1310340410528000)

-------------------

RowKey: TX Austin

=> (column=202, value=33302e3237783039372e3734, timestamp=1310143852392000)

=> (column=203, value=33302e3237783039372e3734, timestamp=1310143852444000)

=> (column=204, value=33302e3332783039372e3733, timestamp=1310143852448000)

=> (column=205, value=33302e3332783039372e3733, timestamp=1310143852453000)

=> (column=206, value=33302e3332783039372e3733, timestamp=1310143852457000)

Deletion As Seen by CLI

[default@Tutorial] list StateCity;
Using default limit of 100

-------------------

RowKey: CA Burlingame

-------------------

RowKey: TX Austin

=> (column=202, value=33302e3237783039372e3734, timestamp=1310143852392000)

=> (column=203, value=33302e3237783039372e3734, timestamp=1310143852444000)

=> (column=204, value=33302e3332783039372e3733, timestamp=1310143852448000)

=> (column=205, value=33302e3332783039372e3733, timestamp=1310143852453000)

=> (column=206, value=33302e3332783039372e3733, timestamp=1310143852457000)

Deletion FYI

Sending a deletion for a non-existing row:

mutator.addDeletion("202230", "Npanxx", city, stringSerializer);

Does not exist? You just inserted a tombstone!

[default@Tutorial] list Npanxx;
Using default limit of 100

. . .

-------------------

RowKey: 202230

-------------------

. . .

Integrating with existing patterns

Yes.

Hector Object Mapper:

https://github.com/rantav/hector/wiki/Hector-Object-Mapper-%28HOM%29

Hector JPA:

https://github.com/riptano/hector-jpa

Integrating with existing patterns

CQL: JDBC Driver and Pool in 1.0!

JdbcTemplate FTW!

Development Resources

Hector Documentation: http://hector-client.org

Cassandra Maven Plugin: http://mojo.codehaus.org/cassandra-maven-plugin/

CCM localhost cassandra cluster: https://github.com/pcmanus/ccm

OpsCenter: http://www.datastax.com/products/opscenter

Cassandra AMIs: https://github.com/riptano/CassandraClusterAMI

Putting it Together

Take control of consistency

If you do need a high degree of consistency, use thresholds to trigger different behavior

- Bank account:

on values over $10,000, wait to hear from all replicas

- Distributed Shopping Cart:

Show a confirmation page to verify order resolution

*** What is your appetite for risk?
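The bank account threshold could be expressed as a small policy function that picks a consistency level per operation. This is a sketch under assumptions: the ConsistencyPolicy class and its Level enum are hypothetical stand-ins named after Cassandra's ONE, QUORUM, and ALL levels, and using QUORUM below the threshold is a choice made for the example, not something the slides prescribe.

```java
import java.math.BigDecimal;

// Hypothetical policy: choose how many replicas must acknowledge a write.
class ConsistencyPolicy {
    // Stand-in enum named after Cassandra's consistency levels.
    enum Level { ONE, QUORUM, ALL }

    static final BigDecimal THRESHOLD = new BigDecimal("10000");

    // Over the threshold, wait to hear from all replicas; otherwise a quorum
    // (the below-threshold level is an assumption of this example).
    static Level levelFor(BigDecimal amount) {
        return amount.compareTo(THRESHOLD) > 0 ? Level.ALL : Level.QUORUM;
    }

    // Acknowledgements needed for a given level and replication factor.
    static int requiredAcks(Level level, int replicationFactor) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // a majority of replicas
            default:
                return replicationFactor;         // ALL
        }
    }

    public static void main(String[] args) {
        Level small = levelFor(new BigDecimal("250.00"));
        Level large = levelFor(new BigDecimal("25000.00"));
        System.out.println(small + " needs " + requiredAcks(small, 3) + " acks");
        System.out.println(large + " needs " + requiredAcks(large, 3) + " acks");
    }
}
```

Keeping the decision in one function makes the risk appetite explicit and easy to change.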

Uniquely identify operations in the application

Facilitates idempotent behavior and out-of-order execution
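A minimal sketch of why unique operation IDs help: if each operation carries its own ID, applying it twice (for example after a client retry) has no additional effect, and independent operations can arrive in any order. The IdempotentLedger class below is hypothetical and keeps its state in memory purely for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical ledger that ignores operations it has already seen.
class IdempotentLedger {
    private final Set<UUID> appliedOperations = new HashSet<>();
    private long balanceCents;

    // Returns true if the operation was applied, false if it was a duplicate.
    boolean apply(UUID operationId, long deltaCents) {
        if (!appliedOperations.add(operationId)) {
            return false; // already applied: a retry changes nothing
        }
        balanceCents += deltaCents;
        return true;
    }

    long balanceCents() {
        return balanceCents;
    }

    public static void main(String[] args) {
        IdempotentLedger ledger = new IdempotentLedger();
        UUID deposit = UUID.randomUUID();
        ledger.apply(deposit, 5_000);
        ledger.apply(deposit, 5_000); // retried after a timeout, ignored
        System.out.println(ledger.balanceCents());
    }
}
```

The ID is generated once by the application when the operation is created, not on each attempt, so every retry presents the same ID.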

Denormalization

The point of normalization is to avoid update anomalies

*** But in an append-only system, we don't do updates

Summary

- Take advantage of strengths

- Look for idempotence and asynchronicity in your business processes

- If it's not in the API, you are probably doing it wrong

- Seek death is still possible if you model incorrectly

Questions

Nate (@zznate)

Additional Resources

DataStax Documentation: http://www.datastax.com/docs/0.8/index

Apache Cassandra project wiki: http://wiki.apache.org/cassandra/

The Dynamo Paper

http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

P. Helland. Building on Quicksand

http://arxiv.org/pdf/0909.1788

P. Helland. Life Beyond Distributed Transactions

http://www.ics.uci.edu/~cs223/papers/cidr07p15.pdf

S. Anand. Netflix's Transition to High-Availability Storage Systems

http://media.amazonwebservices.com/Netflix_Transition_to_a_Key_v3.pdf

The Megastore Paper

http://research.google.com/pubs/archive/36971.pdf