A Quick Look At Cassandra

Post on 15-Jan-2015


This is from a presentation I gave at the Phoenix Java User Group on 11/10/2010.


A Quick Look At Cassandra

Bryan Williams

NoSQL

History

Created at Facebook in 2007

Open Sourced in 2008

Currently version 0.6.6

Version 0.7 in Beta 3

CAP Theorem

Consistency

Availability

Partition Tolerance

Scaling

Vertical

More RAM

Faster CPU

Faster HD

Horizontal

More Servers

Shared Load

Features

Decentralized (peer to peer)

Elastic

Shared Nothing Architecture

Tunable Consistency

Always Writeable

Optimized for excellent throughput on writes

Influences

BigTable

column family data model

High Throughput Writes

Dynamo

High availability

Scalability

Eventual Consistency (Tunable)

Data Model

Cluster

Keyspace

Column Families

Super Columns

Columns
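
One way to picture the hierarchy above is as nested sorted maps. The sketch below is only an illustration of that idea; the class and method names are made up for this example and are not part of Cassandra's API. A super column would add one more map level between the row key and the column name.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: models the hierarchy as nested maps, not Cassandra's API.
public class DataModelSketch {
    // keyspace -> column family -> row key -> column name -> value
    private final Map<String, Map<String, Map<String, Map<String, String>>>> keyspaces =
            new TreeMap<>();

    // Creates the intermediate levels on demand, then stores the column value.
    public void insert(String keyspace, String columnFamily, String rowKey,
                       String column, String value) {
        keyspaces.computeIfAbsent(keyspace, k -> new TreeMap<>())
                 .computeIfAbsent(columnFamily, k -> new TreeMap<>())
                 .computeIfAbsent(rowKey, k -> new TreeMap<>())
                 .put(column, value);
    }

    // Returns the column value, or null if any level of the path is missing.
    public String get(String keyspace, String columnFamily, String rowKey, String column) {
        Map<String, Map<String, Map<String, String>>> families =
                keyspaces.getOrDefault(keyspace, Collections.emptyMap());
        Map<String, Map<String, String>> rows =
                families.getOrDefault(columnFamily, Collections.emptyMap());
        Map<String, String> columns = rows.getOrDefault(rowKey, Collections.emptyMap());
        return columns.get(column);
    }
}
```

Because each level is a sorted map, columns within a row come back ordered by name, which is the property the column sorting types control.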

Cassandra’s CLI (Command Line Interface)

Secondary Indexes

Use another column family with reverse lookup

Specify Metadata on the Column Family and set the index name and type

Support coming in 0.7
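
The first option, a second column family used for reverse lookup, can be sketched with two in-memory maps. This is a rough model under assumed names ("Users", "UsersByCity", `saveUser`, `findByCity` are all hypothetical); a real client would issue the two writes against Cassandra instead.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: a hand-maintained index kept in a second "column family".
public class ReverseLookupSketch {
    // "Users": row key (user id) -> column name -> value
    private final Map<String, Map<String, String>> users = new TreeMap<>();
    // "UsersByCity": row key (city) -> user ids, stored like column names
    private final Map<String, Set<String>> usersByCity = new TreeMap<>();

    // Writes the user row and keeps the index row in step with it.
    public void saveUser(String userId, String city) {
        Map<String, String> row = users.computeIfAbsent(userId, k -> new TreeMap<>());
        String oldCity = row.put("city", city);
        if (oldCity != null && !oldCity.equals(city)) {
            Set<String> oldIndexRow = usersByCity.get(oldCity);
            if (oldIndexRow != null) {
                oldIndexRow.remove(userId); // drop the stale index entry
            }
        }
        usersByCity.computeIfAbsent(city, k -> new TreeSet<>()).add(userId);
    }

    // Looks up user ids by city through the index, without scanning "Users".
    public Set<String> findByCity(String city) {
        return usersByCity.getOrDefault(city, Collections.emptySet());
    }
}
```

The cost of this approach is that the application owns index maintenance: every write that changes the indexed value has to update both column families.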

Writes

Commit Logs

Memtable

SSTable

Hinted Handoff

Bloom Filter

Tombstone
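
The first three items and the tombstone can be tied together in a toy write path: append to a commit log, update the in-memory memtable, and flush it to an immutable SSTable once it grows large enough. Everything here is a simplification with hypothetical names; hinted handoff and Bloom filters are left out, and a real read would reconcile by timestamp rather than by table age.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a single-node, in-memory model of the write path.
public class WritePathSketch {
    private static final String TOMBSTONE = "<tombstone>"; // hypothetical delete marker

    private final List<String> commitLog = new ArrayList<>();
    private Map<String, String> memtable = new TreeMap<>();
    private final List<Map<String, String>> sstables = new ArrayList<>(); // newest last
    private final int flushThreshold;

    public WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    public void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. append to the log for durability
        memtable.put(key, value);         // 2. update the sorted in-memory table
        if (memtable.size() >= flushThreshold) {
            flush();                      // 3. persist as an immutable SSTable
        }
    }

    // A delete is just a write of a marker; the data is not removed in place.
    public void delete(String key) {
        write(key, TOMBSTONE);
    }

    private void flush() {
        sstables.add(Collections.unmodifiableMap(memtable));
        memtable = new TreeMap<>();
        commitLog.clear(); // flushed data no longer needs to be replayed
    }

    // Checks the memtable first, then SSTables from newest to oldest.
    public String read(String key) {
        String value = memtable.get(key);
        for (int i = sstables.size() - 1; value == null && i >= 0; i--) {
            value = sstables.get(i).get(key);
        }
        return TOMBSTONE.equals(value) ? null : value;
    }

    public int sstableCount() {
        return sstables.size();
    }
}
```

Writes never touch existing SSTables, which is a large part of why write throughput is high: the only disk work on the write path is sequential.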

Partitioning

Random Partitioner

Order Preserving Partitioner

Collating Order Preserving Partitioner

Byte Ordered Partitioner
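
As a rough sketch of what a random-style partitioner does: hash the row key with MD5 to get a token, then give the row to the first node on the ring whose token is greater than or equal to it, wrapping around at the end. The class and method names below are hypothetical, and the exact token arithmetic in Cassandra's own partitioner may differ from this simplification.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: hash-based token assignment on a ring.
public class PartitionerSketch {

    // Token = MD5 digest of the key, read as a non-negative integer.
    public static BigInteger md5Token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Owner = first node whose token is >= the key's token, wrapping to the start.
    public static String ownerOf(TreeMap<BigInteger, String> ring, BigInteger token) {
        Map.Entry<BigInteger, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

Hashing spreads keys evenly around the ring but destroys key order, which is the trade-off against the order-preserving partitioners listed above: those allow range scans over keys at the risk of hot spots.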

Snitches

Simple Snitch

Property File Snitch

Column Sorting

AsciiType

BytesType

LexicalUUIDType

LongType

IntegerType

TimeUUIDType

UTF8Type

Custom
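
The practical effect of choosing a sort type is the order in which columns come back from a row. The sketch below imitates two of the types with plain Java comparators; the names `AS_TEXT` and `AS_LONG` are invented for the example, and real column names are bytes rather than Java strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Illustrative only: how the comparator changes column order within a row.
public class ColumnSortingSketch {
    // Text-style ordering, in the spirit of UTF8Type.
    public static final Comparator<String> AS_TEXT = Comparator.naturalOrder();
    // Numeric ordering, in the spirit of LongType.
    public static final Comparator<String> AS_LONG =
            Comparator.comparingLong(Long::parseLong);

    // Stores the columns in a sorted map and returns the names in stored order.
    public static List<String> sortedNames(Comparator<String> comparator, String... names) {
        TreeMap<String, String> columns = new TreeMap<>(comparator);
        for (String name : names) {
            columns.put(name, "value");
        }
        return new ArrayList<>(columns.keySet());
    }
}
```

With the column names "9", "10" and "100", text ordering yields 10, 100, 9 while numeric ordering yields 9, 10, 100, so the type should match how the application wants to slice columns.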

Replication Factor

Set per keyspace

Specified in the server's config file

Indicates how many nodes store a copy of each value on every write

Consistency Level

Set per query

Specified by the client

Indicates how many nodes the client has decided must respond for a successful read/write

Based on replication factor, not on the number of nodes in the system

Write Consistency Levels

Zero: No response required

Any: 1st response from any node (counting hints)

One: 1st response from a replica (hints do not count)

Quorum: N/2 + 1 replicas must respond (N is the replication factor)

All: All replicas must respond

Read Consistency Levels

One: The first response is taken

Quorum: N/2 + 1 replicas are required to respond

All: All replicas are required to respond
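
The quorum arithmetic is easy to check by hand. The sketch below computes N/2 + 1 with integer division and also encodes the commonly cited rule that a read is guaranteed to overlap the latest write when the write and read replica counts add up to more than the replication factor. That rule is not on the slides; it is included here as background, and the method names are hypothetical.

```java
// Illustrative only: quorum size and the read/write overlap rule.
public class ConsistencySketch {

    // Quorum = N/2 + 1 using integer division, where N is the replication factor.
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must share at least one replica with every write set.
    public static boolean readsSeeLatestWrite(int replicationFactor,
                                              int writeReplicas, int readReplicas) {
        return writeReplicas + readReplicas > replicationFactor;
    }
}
```

For a replication factor of 3, quorum is 2, so quorum writes plus quorum reads overlap (2 + 2 > 3), while writing and reading at One does not (1 + 1 is not greater than 3). The number of nodes in the cluster never enters the calculation.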

Gossiper

Protocol used for intra-ring communication

Runs every second on a timer

Used by hinted-handoff

Anti-Entropy

Replica synchronization mechanism

Ensures data on different nodes is up to date

Uses Merkle trees to compare replicas

Runs when a repair is requested (for example via nodetool repair), not after each update
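
The idea behind comparing replicas with Merkle trees is that two nodes can exchange a small hash instead of the data itself. The sketch below builds a hash tree over a list of rows and compares only the root; it is a simplified model with made-up names, it uses SHA-256 as an arbitrary choice, and it does not reflect how Cassandra builds or exchanges its trees.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: comparing two replicas by the root of a hash tree.
public class MerkleSketch {

    private static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                digest.update(part);
            }
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Hashes each row into a leaf, then hashes pairs upward until one root remains.
    public static byte[] rootHash(List<String> rows) {
        List<byte[]> level = new ArrayList<>();
        for (String row : rows) {
            level.add(sha256(row.getBytes(StandardCharsets.UTF_8)));
        }
        if (level.isEmpty()) {
            return sha256(new byte[0]);
        }
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                // An unpaired last node is combined with itself.
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                next.add(sha256(left, right));
            }
            level = next;
        }
        return level.get(0);
    }

    // Replicas with equal roots hold the same rows; unequal roots need repair.
    public static boolean inSync(List<String> replicaA, List<String> replicaB) {
        return Arrays.equals(rootHash(replicaA), rootHash(replicaB));
    }
}
```

A fuller version would keep the intermediate levels so that, when the roots differ, the nodes can walk down the tree and transfer only the ranges that disagree.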

Read Repair

Triggered when a read finds inconsistent data on different nodes

Timestamps from all replicas are checked

All replicas are updated to the most recent value

Weak vs. strong consistency determines whether Read Repair happens after (weak) or before (strong) results are returned
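
The read repair steps can be sketched as a timestamp comparison followed by an overwrite of the stale copies. The types and method names below are hypothetical, and the replicas are modelled as a simple map from node name to that node's copy of one column.

```java
import java.util.Map;

// Illustrative only: newest timestamp wins, stale replicas are overwritten.
public class ReadRepairSketch {

    public static final class Versioned {
        public final String value;
        public final long timestamp;

        public Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Returns the newest copy and pushes it to every replica holding an older one.
    public static Versioned readWithRepair(Map<String, Versioned> replicas) {
        Versioned newest = null;
        for (Versioned copy : replicas.values()) {
            if (newest == null || copy.timestamp > newest.timestamp) {
                newest = copy;
            }
        }
        if (newest != null) {
            for (Map.Entry<String, Versioned> entry : replicas.entrySet()) {
                if (entry.getValue().timestamp < newest.timestamp) {
                    entry.setValue(newest); // repair the stale replica
                }
            }
        }
        return newest;
    }
}
```

In this sketch the repair runs before the result is returned, which corresponds to the strong case; the weak case would return the first response and run the same comparison in the background.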

Replication Strategies

Simple Strategy

Old Network Topology Strategy

Network Topology Strategy
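
For the simple strategy, replica placement can be pictured as walking the ring clockwise from the node that owns the token and taking as many nodes as the replication factor asks for. The sketch below assumes that behaviour and uses hypothetical names and `long` tokens for brevity; the topology-aware strategies additionally take racks and data centers into account, which is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative only: pick the owner plus the next nodes clockwise on the ring.
public class SimpleStrategySketch {

    public static List<String> replicasFor(TreeMap<Long, String> ring,
                                           long token, int replicationFactor) {
        // Nodes at or after the token come first, then wrap to the start of the ring.
        List<String> clockwise = new ArrayList<>(ring.tailMap(token, true).values());
        clockwise.addAll(ring.headMap(token, false).values());
        int count = Math.min(replicationFactor, clockwise.size());
        return new ArrayList<>(clockwise.subList(0, count));
    }
}
```

On a ring with tokens 0, 100 and 200 for nodes A, B and C, a key with token 150 and a replication factor of 2 lands on C and then A.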

Java Client Options

Thrift : http://incubator.apache.org/thrift

Avro : http://avro.apache.org

Hector : https://github.com/rantav/hector

Pelops : http://code.google.com/p/pelops

Kundera : http://code.google.com/p/kundera

More : http://wiki.apache.org/cassandra/ClientOptions

Cassandra: The Definitive Guide

Author: Eben Hewitt

Publisher: O'Reilly

Release: Late November

Thanks For Coming

Bryan Williams

Email : Bwilliams@integrallis.com

Twitter : @BryWilliams

LINKS

http://cassandra.apache.org

http://wiki.apache.org

https://github.com/ericflo/twissandra