Post on 15-Jan-2015
A Quick Look At
Bryan Williams
NoSQL
History
Created at Facebook in 2007
Open Sourced in 2008
Currently version 0.6.6
Version 0.7 in Beta 3
CAP Theorem
Consistency
Availability
Partition Tolerance
Scaling
Vertical
More RAM
Faster CPU
Faster HD
Horizontal
More Servers
Shared Load
Features
Decentralized (peer to peer)
Elastic
Shared Nothing Architecture
Tunable Consistency
Always Writeable
Optimized for excellent throughput on writes
Influences
BigTable
Column Family Data Model
High Throughput Writes
Dynamo
High Availability
Scalability
Eventual Consistency (Tunable)
Data Model
Cluster
Keyspace
Column Families
Super Columns
Columns
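One way to picture the hierarchy on this slide is as nested maps. The sketch below uses plain Python dictionaries and is not a Cassandra client. The keyspace, column family, row key, and column names are all made up for illustration, and the timestamps are arbitrary integers.

```python
# Illustrative only: the data model as nested dictionaries.
# Every name below (Blog, Users, jdoe, ...) is hypothetical.
cluster = {
    "Blog": {                                   # keyspace
        "Users": {                              # column family
            "jdoe": {                           # row key
                # column: name -> (value, timestamp)
                "email": ("jdoe@example.com", 1),
                "city": ("Springfield", 1),
            },
        },
        "UserAddresses": {                      # super column family
            "jdoe": {                           # row key
                "home": {                       # super column: a named group of columns
                    "city": ("Springfield", 1),
                    "zip": ("00000", 1),
                },
            },
        },
    },
}


def get_column(keyspace, column_family, key, column):
    """Look up one column value in a standard (non-super) column family."""
    value, _timestamp = cluster[keyspace][column_family][key][column]
    return value


def get_super_column(keyspace, column_family, key, super_column):
    """Return the columns grouped under one super column as name -> value."""
    columns = cluster[keyspace][column_family][key][super_column]
    return {name: value for name, (value, _ts) in columns.items()}
```

Reading a value always walks the same path: keyspace, column family, row key, then either a column or a super column and its columns.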
Cassandra’s CLI (Command Line Interface)
Secondary Indexes
Use another column family with reverse lookup
Specify Metadata on the Column Family and set the index name and type
Support coming in 0.7
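The first option, a second column family used as a reverse lookup, can be sketched with two in-memory maps. This is a minimal illustration, not client code. The names users, users_by_city, insert_user, and find_users_in are hypothetical, and a real application would write to both column families through its client library.

```python
# Hypothetical in-memory stand-ins for two column families.
users = {}           # row key: user id -> {column name: value}
users_by_city = {}   # row key: city    -> {user id: ""}  (the reverse lookup)


def insert_user(user_id, name, city):
    """Write the user row and the matching index row together."""
    users[user_id] = {"name": name, "city": city}
    # The index row stores user ids as column names; the values are unused.
    users_by_city.setdefault(city, {})[user_id] = ""


def find_users_in(city):
    """Read the index row, then return the matching user ids in sorted order."""
    return sorted(users_by_city.get(city, {}))
```

The application is responsible for keeping both column families in step. This sketch does not handle updates that move a user from one city to another.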
Writes
Commit Logs
Memtable
SSTable
Hinted Handoff
Bloom Filter
Tombstone
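The write path named on this slide can be sketched as a toy single-node store: append to the commit log, update the memtable, and flush the memtable to an immutable SSTable when it fills up. This is a simplified model under stated assumptions, not Cassandra's implementation. The Node class and the memtable_limit parameter are hypothetical, and the Bloom filter and hinted handoff are left out.

```python
TOMBSTONE = object()  # marker written in place of a value when a key is deleted


class Node:
    """Toy model of one node's write path (hypothetical, heavily simplified)."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only record of every write
        self.memtable = {}        # in-memory: key -> (value, timestamp)
        self.sstables = []        # flushed tables, never modified afterwards
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        self.commit_log.append((key, value, timestamp))  # durability first
        self.memtable[key] = (value, timestamp)          # then the in-memory table
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key, timestamp):
        # A delete is just a write of a tombstone; old data is not touched.
        self.write(key, TOMBSTONE, timestamp)

    def flush(self):
        # Persist the memtable as a key-sorted, immutable table and start fresh.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        # Collect every version of the key and keep the one with the newest timestamp.
        candidates = [t[key] for t in [self.memtable] + self.sstables if key in t]
        if not candidates:
            return None
        value, _ts = max(candidates, key=lambda c: c[1])
        return None if value is TOMBSTONE else value
```

Because a delete only adds a tombstone, older values stay in earlier SSTables until they are cleaned up later. The read hides them by preferring the newest timestamp.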
Partitioning
Random Partitioner
Order Preserving Partitioner
Collating Order Preserving Partitioner
Byte Order Partitioner
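To illustrate the idea behind the Random Partitioner, the sketch below hashes a row key with MD5 to get a token and finds the node whose token range covers it. It is a simplified model. The four-node ring, the node names, and the helper functions token_for and owner are assumptions for the example, and Cassandra's own token arithmetic differs in detail.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # size of the MD5 output space


def token_for(key: bytes) -> int:
    """Hash the row key and read the 16-byte digest as a non-negative integer."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


# Hypothetical 4-node ring with evenly spaced tokens.
ring = sorted((i * RING_SIZE // 4, f"node{i}") for i in range(4))
tokens = [token for token, _name in ring]


def owner(key: bytes) -> str:
    """Return the first node whose token is >= the key's token, wrapping around."""
    i = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[i][1]
```

Because the token comes from a hash, adjacent row keys land on unrelated parts of the ring. That spreads load evenly but gives up ordered range scans over keys, which is the trade-off the order-preserving partitioners make in the other direction.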
Snitches
Simple Snitch
Property File Snitch
Column Sorting
AsciiType
BytesType
LexicalUUIDType
LongType
IntegerType
TimeUUIDType
UTF8Type
Custom
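The comparator chosen for a column family decides the order in which column names are stored and returned. The sketch below mimics two of the types listed above with ordinary Python sort keys. It is an analogy only: sort_columns is a hypothetical helper, and the real LongType compares 8-byte binary longs rather than decimal text.

```python
def sort_columns(names, comparator):
    """Sort column names the way a given comparator type would, roughly."""
    if comparator == "BytesType":
        return sorted(names)                                   # raw byte order
    if comparator == "LongType":
        return sorted(names, key=lambda n: int(n.decode("ascii")))  # numeric order
    raise ValueError(f"unsupported comparator in this sketch: {comparator}")


names = [b"10", b"9", b"2"]
by_bytes = sort_columns(names, "BytesType")   # [b"10", b"2", b"9"]
by_long = sort_columns(names, "LongType")     # [b"2", b"9", b"10"]
```

The same three names come back in different orders, which matters for slice queries that read a range of columns.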
Replication Factor
Set per keyspace
Specified in the server's config file
Indicates how many nodes store a copy of each value on every write
Consistency Level
Set per query
Specified by the client
Indicates how many replicas must respond for a read or write to succeed
Based on replication factor, not on the number of nodes in the system
Write Consistency Levels
Zero: No response required
Any: First response from any node (counting hints)
One: First response from a replica
Quorum: N/2 + 1 replicas must respond
All: All replicas must respond
Read Consistency Levels
One: The first response is taken
Quorum: N/2 + 1 replicas are required to respond
All: All replicas are required to respond
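The quorum arithmetic on these slides can be worked through in a few lines. In the sketch below N is the replication factor, W the number of replicas a write waits for, and R the number a read waits for. The function names are hypothetical, and the overlap rule R + W > N is the standard quorum argument rather than something stated on the slides.

```python
def quorum(replication_factor: int) -> int:
    """N/2 + 1 using integer division, as on the slides."""
    return replication_factor // 2 + 1


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return r + w > n


# With a replication factor of 3, a quorum is 2 replicas.
N = 3
Q = quorum(N)                                # 2
quorum_both = read_overlaps_write(N, Q, Q)   # True: 2 + 2 > 3
one_both = read_overlaps_write(N, 1, 1)      # False: 1 + 1 <= 3
```

Reading and writing at Quorum guarantees the read touches at least one replica that saw the latest write. Reading and writing at One does not.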
Gossiper
Protocol used for intra-ring communication
Runs every second on a timer
Used by hinted-handoff
Anti-Entropy
Replica synchronization mechanism
Ensures data on different nodes is up to date
Uses Merkle trees to compare replicas
Runs during repair (for example, at major compaction), not after each update
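A Merkle tree lets two replicas compare a large range of data by exchanging hashes instead of the data itself. The sketch below builds a tree over a fixed number of ranges and walks down only the branches whose hashes differ. It is a minimal illustration: build_tree and diff are hypothetical helpers, SHA-1 is an arbitrary choice of hash, and the number of leaves is assumed to be a non-zero power of two.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def build_tree(leaves):
    """Return the tree as a list of levels: levels[0] holds the leaf hashes,
    levels[-1][0] is the root. Assumes len(leaves) is a non-zero power of two."""
    levels = [[_h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def diff(a, b, level=None, index=0):
    """Return the indexes of leaves that differ, descending only where hashes differ."""
    if level is None:
        level = len(a) - 1                   # start at the root
    if a[level][index] == b[level][index]:
        return []                            # whole subtree matches: skip it
    if level == 0:
        return [index]                       # a differing leaf range
    return (diff(a, b, level - 1, 2 * index)
            + diff(a, b, level - 1, 2 * index + 1))
```

If the roots match, the replicas agree on the whole range and nothing more is exchanged. Otherwise only the differing leaf ranges need to be streamed between nodes.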
Read Repair
Triggered when a read operation finds inconsistent data on different nodes
Timestamps from all replicas are checked
All replicas are updated to the most recent value
Weak vs. strong consistency: with weak consistency Read Repair happens after results are returned; with strong consistency it happens before
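The read repair steps above can be sketched as timestamp-based reconciliation across replicas. This is a simplified model, not Cassandra's code path. Each replica is a plain dictionary mapping a key to a (value, timestamp) pair, and read_with_repair is a hypothetical name. The repair here runs before the value is returned, which corresponds to the strong-consistency case.

```python
def read_with_repair(replicas, key):
    """Read `key` from every replica, pick the newest version by timestamp,
    and write that version back to any replica that is missing it or stale."""
    responses = [replica[key] for replica in replicas if key in replica]
    if not responses:
        return None
    latest = max(responses, key=lambda vt: vt[1])   # highest timestamp wins
    for replica in replicas:
        if replica.get(key) != latest:
            replica[key] = latest                   # repair the stale replica
    return latest[0]
```

After one such read, every replica holds the same value and timestamp for the key.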
Replication Strategies
Simple Strategy
Old Network Topology Strategy
Network Topology Strategy
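As a rough illustration of the Simple Strategy, the sketch below places the first replica on the node that owns the key and the remaining replicas on the next nodes clockwise around the ring, ignoring racks and data centers. The function name and the four-node ring are hypothetical. The topology-aware strategies are not modeled.

```python
def simple_strategy_replicas(ring, primary_index, replication_factor):
    """Return the nodes that hold a key: the owner plus the next nodes clockwise.

    ring: node names in token order; primary_index: position of the owning node.
    """
    count = min(replication_factor, len(ring))   # cannot exceed the ring size
    return [ring[(primary_index + i) % len(ring)] for i in range(count)]


ring = ["A", "B", "C", "D"]
placement = simple_strategy_replicas(ring, primary_index=3, replication_factor=3)
# placement is ["D", "A", "B"]: the walk wraps past the end of the ring
```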
Java Client Options
Thrift : http://incubator.apache.org/thrift
Avro : http://avro.apache.org
Hector : https://github.com/rantav/hector
Pelops : http://code.google.com/p/pelops
Kundera : http://code.google.com/p/kundera
More : http://wiki.apache.org/cassandra/ClientOptions
Cassandra: The Definitive Guide
Author: Eben Hewitt
Publisher: O'Reilly
Release: Late November
Thanks For Coming
Bryan Williams
Email : Bwilliams@integrallis.com
Twitter : @BryWilliams
LINKS
http://cassandra.apache.org
http://wiki.apache.org
https://github.com/ericflo/twissandra