Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Page 1: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Cassandra

Say Goodbye to the Relational Database

Twin Cities PHP User Group

May 6, 2010

Chris Barber

CB1, INC.

http://www.cb1inc.com/

Page 2: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Minnesota PHP User Group | 5.6.2010 | Chris Barber | CB1, INC. | http://www.cb1inc.com/

About Me

● Chris Barber

● Open source hacker

● Software consultant

● JavaScript, C++, PHP

● http://www.cb1inc.com/

● http://twitter.com/cb1inc

● http://twitter.com/cb1kenobi

● http://slideshare.net/cb1kenobi

Page 3: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

What is Cassandra?

Page 4: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

A highly scalable, eventually consistent, distributed, structured key-value store.

Page 5: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

About Cassandra

● Started by Facebook

● Open Source

  ● Apache Project

  ● Apache License 2.0

● Written in Java

● Multi-platform

● Current Version 0.6.1

● http://cassandra.apache.org/

Page 6: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Who's using Cassandra?

Page 7: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Page 8: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Cassandra Internals

Page 9: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Cassandra Overview

● Like a big hash table of hash tables

● Column Database (schemaless)

● Highly scalable

  ● Add nodes in minutes

● Fault tolerant

● Distributed

● Tunable

Page 10: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Dynamo + BigTable = Cassandra

● Amazon Dynamo

  ● Cluster management

  ● Replication

  ● Fault tolerance

● Google BigTable

  ● Sparse

  ● Columnar data model

  ● Storage architecture

Page 11: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Pros & Cons

● Pros

  ● Easy to scale

  ● No single point of failure

  ● High write throughput

  ● Handles lots of data

  ● Durable

  ● No more SQL injection

● Cons

  ● No joins

  ● Index & sort keys only

  ● Not good for large blobs

  ● Rows must fit in memory

  ● Built on Thrift

Page 12: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

CAP Theorem

● CAP Theorem

  ● Consistency

  ● Availability

  ● Partition tolerance

● You can only have 2

● Cassandra is Available and Partition-tolerant

  ● Eventually consistent

    – Consistency level can be set on a per-request basis

Page 13: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Consistency

● Specified for each operation

  ● Zero

  ● One

  ● Quorum (N/2 + 1, where N is the replication factor)

  ● All

  ● Any
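
To make the levels above concrete, here is a minimal Python sketch of the replica arithmetic. It assumes a replication factor N and the common rule that a read is guaranteed to overlap the latest write when W + R > N. The function names are illustrative and are not part of the Cassandra or Thrift API.

```python
def quorum(replication_factor):
    """Smallest majority of replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def guaranteed_replica_acks(level, replication_factor):
    """Replicas guaranteed to hold the data once a request at `level` returns."""
    acks = {
        "ZERO": 0,  # fire and forget: nothing is awaited
        "ANY": 0,   # a hint stored on a non-replica node is enough
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return acks[level]


def read_sees_latest_write(write_level, read_level, replication_factor):
    """True when the write set and the read set must share a replica (W + R > N)."""
    w = guaranteed_replica_acks(write_level, replication_factor)
    r = guaranteed_replica_acks(read_level, replication_factor)
    return w + r > replication_factor
```

With N = 3, a QUORUM write followed by a QUORUM read overlaps (2 + 2 > 3), while ONE plus ONE does not (1 + 1 is not greater than 3).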

Page 14: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Replication Ring

● Ring of servers

● Talk to each other using "gossip"

● Data distributed between nodes

● Uses "tokens" to partition data

  ● Must be unique per node
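
The token ring can be sketched in a few lines of Python. This is an illustration only: the MD5 hashing, the 2^127 token space, and the `Ring` class are assumptions made for the sketch, not Cassandra's actual code. Each node owns the range of tokens up to and including its own token, and lookups wrap around the end of the ring.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key):
    """Hash a row key onto the ring (MD5 is used here as an illustration)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class Ring:
    def __init__(self, node_tokens):
        # node_tokens maps a node name to its token; tokens must be unique.
        if len(set(node_tokens.values())) != len(node_tokens):
            raise ValueError("tokens must be unique per node")
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner_of_token(self, token):
        """First node whose token is >= the given token, wrapping around."""
        idx = bisect.bisect_left(self._tokens, token)
        return self._entries[idx % len(self._entries)][1]

    def owner(self, key):
        """Node responsible for a row key."""
        return self.owner_of_token(token_for(key))


ring = Ring({"node-a": 0, "node-b": RING_SIZE // 3, "node-c": 2 * RING_SIZE // 3})
print(ring.owner("cb1kenobi"))
```

Spacing the tokens evenly, as in the usage lines, is what keeps the load balanced when keys are hashed.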

Page 15: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Partitioning

● RandomPartitioner

  ● Inefficient range queries

  ● Rows are not stored in key order

● OrderPreservingPartitioner

  ● Can cause unevenly distributed data

  ● Stores data sorted by key
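
The trade-off between the two partitioners can be shown with a small Python sketch. It assumes MD5 as the hash for the random case, and the helper names are made up for the illustration. When rows are stored in key order, a range query is one contiguous slice. When they are stored in hash order, the same keys end up scattered.

```python
import bisect
import hashlib


def random_partitioner_token(key):
    """Token derived from a hash of the key (MD5, as an assumption)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def storage_order(keys, token_fn):
    """Order in which rows would be laid out for a given token function."""
    return sorted(keys, key=token_fn)


def key_range(ordered_keys, start, end):
    """Contiguous slice of keys in [start, end]; needs key-ordered storage."""
    lo = bisect.bisect_left(ordered_keys, start)
    hi = bisect.bisect_right(ordered_keys, end)
    return ordered_keys[lo:hi]


keys = ["dave", "alice", "carol", "bob"]
by_key = storage_order(keys, lambda k: k)                # order-preserving
by_hash = storage_order(keys, random_partitioner_token)  # random

print(key_range(by_key, "b", "d"))  # ['bob', 'carol']
print(by_hash)                      # same keys, in hash order
```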

Page 16: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Replica Placement Strategy

● Rack-unaware

  ● Default

● Rack-aware

  ● Place one replica in a different datacenter, and the others on different racks in the same one
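
A rough Python sketch of the two strategies follows. It assumes the ring is a list of nodes in token order and that the rack-unaware strategy places replicas on the next nodes clockwise from the primary. The `Node` tuple and both functions are hypothetical helpers, and the rack-aware version follows only the one-line description above, not the real implementation.

```python
from collections import namedtuple

Node = namedtuple("Node", "name datacenter rack")


def rack_unaware(ring, primary, replication_factor):
    """Primary plus the next nodes clockwise on the ring."""
    n = len(ring)
    return [ring[(primary + i) % n] for i in range(min(replication_factor, n))]


def rack_aware(ring, primary, replication_factor):
    """One replica in another datacenter, the rest on other racks at home."""
    n = len(ring)
    walk = [ring[(primary + i) % n] for i in range(n)]  # clockwise from primary
    first = walk[0]
    chosen = [first]
    # One replica in a different datacenter, if there is one.
    remote = next((x for x in walk if x.datacenter != first.datacenter), None)
    if remote is not None and len(chosen) < replication_factor:
        chosen.append(remote)
    # Then nodes in the primary's datacenter on racks not used yet.
    used_racks = {first.rack}
    for x in walk[1:]:
        if len(chosen) >= replication_factor:
            break
        if x.datacenter == first.datacenter and x.rack not in used_racks:
            chosen.append(x)
            used_racks.add(x.rack)
    # Fill any remaining slots with the next unused nodes on the ring.
    for x in walk[1:]:
        if len(chosen) >= replication_factor:
            break
        if x not in chosen:
            chosen.append(x)
    return chosen


ring = [
    Node("a", "dc1", "r1"),
    Node("b", "dc1", "r1"),
    Node("c", "dc1", "r2"),
    Node("d", "dc2", "r1"),
]
print([x.name for x in rack_unaware(ring, 0, 3)])  # ['a', 'b', 'c']
print([x.name for x in rack_aware(ring, 0, 3)])    # ['a', 'd', 'c']
```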

Page 17: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Data Model

● Keyspace

● Column Family (standard or super)

● Columns & Super Columns

● Keys and column names

Keyspace1: {
  users: {
    "cb1kenobi": {
      "FirstName": "chris",
      "LastName": "barber"
    }
  }
}
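
The "hash table of hash tables" idea maps directly onto nested dictionaries. The Python sketch below is only a mental model: `insert` and `get_column` are hypothetical helpers, not client API calls, and the `Email` column and its value are made-up example data. It assumes, as in 0.6, that column families are declared up front in the configuration, while rows and columns are created on write.

```python
# keyspace -> column family -> row key -> column name -> value
keyspace1 = {
    "users": {
        "cb1kenobi": {
            "FirstName": "chris",
            "LastName": "barber",
        }
    }
}


def insert(keyspace, column_family, key, column, value):
    """Set one column; the column family must already be defined."""
    rows = keyspace[column_family]  # KeyError if the family is not defined
    rows.setdefault(key, {})[column] = value


def get_column(keyspace, column_family, key, column):
    """Read one column of one row."""
    return keyspace[column_family][key][column]


# Rows are sparse: another row may carry a different set of columns.
insert(keyspace1, "users", "someuser", "Email", "someone@example.com")
print(get_column(keyspace1, "users", "cb1kenobi", "FirstName"))  # chris
```

A super column family would add one more level of nesting between the row key and the column names.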

Page 18: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Installing & Deploying Cassandra

Page 19: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Getting Cassandra

● http://cassandra.apache.org/download/

  ● http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-bin.tar.gz

  ● http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-src.tar.gz

● svn checkout https://svn.apache.org/repos/asf/cassandra/trunk cassandra

● git clone git://git.apache.org/cassandra.git

Page 21: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Installing Cassandra

su
cd /usr/local
wget http://www.apache.org/dist/cassandra/0.6.1/apache-cassandra-0.6.1-src.tar.gz
tar xzf apache-cassandra-0.6.1-src.tar.gz
mkdir -p /var/log/cassandra
chown -R `whoami` /var/log/cassandra
mkdir -p /var/lib/cassandra
chown -R `whoami` /var/lib/cassandra
cd apache-cassandra-0.6.1-src
ant

Page 22: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Configuration

● Main config file

  ● conf/storage-conf.xml

● Keyspaces

● Partitioner

● AutoBootstrap

● Authentication method

● Buffer sizes

● Timeouts

Page 23: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Automatically Start Cassandra

groupadd cassandra
useradd -g cassandra cassandra

<editor of choice> /etc/init.d/cassandra
# paste contents of next slide

chmod +x /etc/init.d/cassandra

# Ubuntu/Debian method:
update-rc.d -f cassandra defaults

# Red Hat/Fedora method: use chkconfig

Page 24: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Automatically Start Cassandra

#!/bin/bash
export JAVA_HOME=/usr/bin/java
export CASSANDRA_HOME=/usr/local/apache-cassandra-0.6.1-src
export CASSANDRA_INCLUDE=$CASSANDRA_HOME/bin/cassandra.in.sh
export CASSANDRA_CONF=$CASSANDRA_HOME/conf/storage-conf.xml
export CASSANDRA_OWNR=cassandra
export PATH=$PATH:$CASSANDRA_HOME/bin
log_file=/var/log/cassandra/stdout
pid_file=/var/run/cassandra/pid_file

if [ ! -f $CASSANDRA_HOME/bin/cassandra -o ! -d $CASSANDRA_HOME ]
then
    echo "Cassandra startup: cannot start"
    exit 1
fi

mkdir -p /var/run/cassandra
chown cassandra:cassandra /var/run/cassandra

case "$1" in
    start)
        # Cassandra startup
        echo -n "Starting Cassandra: "
        su $CASSANDRA_OWNR -c "$CASSANDRA_HOME/bin/cassandra -p $pid_file" > $log_file 2>&1
        echo "OK"
        ;;
    stop)
        # Cassandra shutdown
        echo -n "Shutdown Cassandra: "
        su $CASSANDRA_OWNR -c "kill `cat $pid_file`"
        echo "OK"
        ;;
    reload|restart)
        $0 stop
        $0 start
        ;;
    status)
        ;;
    *)
        echo "Usage: `basename $0` start|stop|restart|reload"
        exit 1
esac

exit 0

Page 25: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Running Cassandra

● Manually start

  ● bin/cassandra -f

● Command line app

  ● bin/cassandra-cli --host localhost --port 9160

● Nodetool

  ● bin/nodetool -h localhost info

  ● Many more commands: ring, cleanup, cfstats, etc.

20146078924586773365182178806181105130
Load : 274.66 KB
Generation No : 1273183803
Uptime (seconds) : 121
Heap Memory (MB) : 51.84 / 1023.88

Page 26: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

PHP Clients

Page 27: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

PHP Clients

● Thrift

● Pandra (LGPL)

● PHP Cassa – pycassa port

● Simple Cassie (New BSD License)

● Prophet (PHP License)

Page 28: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

PHP Thrift Client

● Thrift files

  ● Thrift.php

  ● protocol/TBinaryProtocol.php

  ● protocol/TProtocol.php

  ● transport/TBufferedTransport.php

  ● transport/TFramedTransport.php

  ● transport/THttpClient.php

  ● transport/TMemoryBuffer.php

  ● transport/TNullTransport.php

  ● transport/TPhpStream.php

  ● transport/TSocket.php

  ● transport/TSocketPool.php

  ● transport/TTransport.php

● Thrift generated PHP files

  ● thrift --gen php cassandra.thrift

    – cassandra_constants.php

    – Cassandra.php

    – cassandra_types.php

● Use thrift_protocol native PHP extension

Page 29: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

PHP Thrift Client Example

<?php
$GLOBALS['THRIFT_ROOT'] = './thrift';
require $GLOBALS['THRIFT_ROOT'] . '/Thrift.php';
require $GLOBALS['THRIFT_ROOT'] . '/transport/TSocket.php';
require $GLOBALS['THRIFT_ROOT'] . '/transport/TBufferedTransport.php';
require $GLOBALS['THRIFT_ROOT'] . '/protocol/TBinaryProtocol.php';
require $GLOBALS['THRIFT_ROOT'] . '/packages/cassandra/Cassandra.php';

// Connect to the local node on the Thrift port
$socket = new TSocket('127.0.0.1', 9160);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient($protocol);

$transport->open();

// Address the "firstname" column in the Standard1 column family
$columnPath = new cassandra_ColumnPath();
$columnPath->column_family = 'Standard1';
$columnPath->super_column = null;
$columnPath->column = 'firstname';

// Write the column, then read it back
$client->insert('Keyspace1', 'mykey', $columnPath, 'Chris', time(), cassandra_ConsistencyLevel::ONE);

$name = $client->get('Keyspace1', 'mykey', $columnPath, cassandra_ConsistencyLevel::ONE);
var_dump($name);

$transport->close();
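
The `time()` argument in the insert call above is the column timestamp, which Cassandra uses to decide which of two conflicting writes wins. The Python sketch below models that rule under stated assumptions: `reconcile` and `now_micros` are hypothetical names, ties keep the first argument for simplicity, and using microseconds rather than seconds is a common client convention, not something the slide prescribes.

```python
import time


def now_micros():
    """Client-side timestamp in microseconds since the epoch."""
    return int(time.time() * 1_000_000)


def reconcile(a, b):
    """Pick the winner of two (value, timestamp) versions of a column.

    The version with the higher timestamp wins; on a tie this sketch
    keeps the first argument.
    """
    return a if a[1] >= b[1] else b


older = ("Chris", 1273183803)
newer = ("Christopher", 1273183999)
print(reconcile(older, newer))  # ('Christopher', 1273183999)
```

One consequence is that every client writing to the same column should use the same timestamp unit, or writes with the smaller unit will always lose.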

Page 30: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Prophet PHP Extension

● C++ PHP Extension

● Built on top of Thrift C library

● Very, very, very far from usable/working/complete

● Goals

  ● Speed!

  ● Full API support

  ● CRUD/ORM magic

  ● Serialization helper

● Developed for PHP 5.3, Linux, non-threaded (i.e. FastCGI)

● http://github.com/cb1kenobi/prophet

Page 31: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Cassandra Roadmap

Page 32: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Roadmap 0.7 & Beyond

● SSTable compression

● Live keyspace & column family changes

● Vector clock support

● Truncate support

● Range delete

● byte[] keys

● Memory efficient compactions

● Apache Avro

● Multi-tenant support

* Taken from other presentations

Page 33: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Resources

● Cassandra Wiki

  ● http://wiki.apache.org/cassandra/

● IRC

  ● #cassandra on irc.freenode.net

● Cassandra Users Mailing List

  ● [email protected]

● Follow people on Twitter

  ● @cassandra

  ● @spyced

  ● @b6n

  ● @jericevans

  ● @riptano

Page 34: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Getting Help

CB1, INC

http://www.cb1inc.com/

Web Applications

Open Source Solutions

Page 35: Cassandra - Say Goodbye to the Relational Database (5-6-2010)

Thanks!

Questions?

http://www.cb1inc.com/

http://twitter.com/cb1inc

http://slideshare.net/cb1kenobi

http://twitter.com/cb1kenobi