I’ve outgrown my basic stack. Now what?
Uploaded by: francis-david-cleary
Category: Technology
Hivereader.com
Thoughts and feelings about growing with Django and NoSQL
Our common stack is built on:
PostgreSQL: Super awesome and fast (once you learn what knobs to turn). Lots of cool features and tools: pgtune, pg_top, PgBouncer
Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.
Memcached: In-memory key-value store for small chunks of data. Super simple and awesome.
People are signing up and using your app/game/site/bread maker/whatever
hooray
But
Things start to get slow
or worse
Single server: app, DB, and other all on one box
Find your bottlenecks:
Unless your app is doing something crazy, you’re mostly abusing the DB
Add more caching or, better yet, smarter caching
PgBouncer
Welcome to the Postgres config file
pgtune is here to help*
*kind of
Still Growing?
[Diagram: app, DB, and other components spread across multiple servers]
Need a solution for lots of data that is growing quickly.
The solution needs to be targeted for my problem.
NoSQL to the rescue?
But wait, can Postgres handle this?
Most likely:
partitioning and sharding
Can mean lots of app code sometimes.
What about when you outgrow your shard key?
“Don’t shard until you have to” - every single talk I’ve seen
Slony? “master to multiple slaves” replication
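To make the shard-key worry above concrete, here is a minimal sketch of app-level shard routing. It assumes the simplest possible scheme (hash the key, take it modulo the shard count); `shard_for` and `moved_fraction` are hypothetical helper names, and real deployments often use something more forgiving, such as a lookup table or consistent hashing.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a shard key (e.g. a user id) to a shard index, stable across runs."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys that land on a different shard after resizing."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


user_ids = range(10_000)
print("user 42 lives on shard", shard_for(42, 4))
print("fraction moved going from 4 to 5 shards:", moved_fraction(user_ids, 4, 5))
```

With plain modulo routing, adding one shard sends most keys to a different shard, which is a large part of why "don't shard until you have to" keeps coming up.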
Lots of options
+ tons more
Nearly all of them are based on 2 papers
Built on the CAP theorem
The theorem began as a conjecture made by University of California, Berkeley computer scientist Eric Brewer at the 2000 Symposium on Principles of Distributed Computing.
In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem.
Which one?
Redis:
Super fast
Advanced key-value store
Think of it as super memcached. With union math.
All data must fit in RAM
It is often referred to as a data structure server
Keys can contain strings, hashes, lists, sets and sorted sets
Not a DB solution but more of a helper.
This is now a part of our basic stack for most apps.
Document store
JSON-like documents with dynamic schemas
Ad-hoc queries
Indexing
Load-balancing: MongoDB scales horizontally using sharding
MongoDB uses a readers-writer lock that allows concurrent read access to a database but gives exclusive access to a single write operation. While a write lock is held, that one write operation holds it exclusively, and no other read or write operations may share the lock.
Global write lock*
Uncompressed field names
Safe mode off by default
Just google “mongo problems” or “moving off mongodb”
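The readers-writer lock described above can be illustrated generically. This is a textbook-style sketch in Python, not MongoDB's implementation; the `ReadWriteLock` class and its non-blocking `blocking=False` option are assumptions made so the behaviour is easy to observe from a single thread.

```python
import threading


class ReadWriteLock:
    """Many readers may hold the lock at once; a writer holds it alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of active readers
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self, blocking=True):
        with self._cond:
            while self._writer:
                if not blocking:
                    return False
                self._cond.wait()
            self._readers += 1
            return True

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake any waiting writer

    def acquire_write(self, blocking=True):
        with self._cond:
            while self._writer or self._readers > 0:
                if not blocking:
                    return False
                self._cond.wait()
            self._writer = True
            return True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()  # wake waiting readers and writers


lock = ReadWriteLock()
lock.acquire_read()
lock.acquire_read()                               # two readers at once is fine
writer_got_in = lock.acquire_write(blocking=False)  # refused while readers hold it
lock.release_read()
lock.release_read()
```

When one such lock guards a whole database, every write stalls all reads for its duration, which is the concern behind the "global write lock" bullet. This sketch also gives writers no priority, so a steady stream of readers could starve a writer.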
Hadoop / HBase:
Uses pre-defined column family format
Map Reduce
Used by all people with ‘big data’ problems
Amazing workhorse for data
You need a sizable cluster
Cluster setup can be difficult
I need to personally spend more time with this
CouchDB:
JSON to store data, JavaScript for MapReduce and HTTP for an API.
Views: embedded map/reduce
Multi-master replication
BigCouch, Couchbase, Membase?
Kind of in a dev rut...
...but just pushed a new huge upgrade
Riak:
Based hardcore on Amazon's Dynamo paper
Key-value store
Super good about failure, “no downtime”
Map Reduce / Secondary Indexes
Built-in full text search
Link walking
2 types of mapreduce:
JavaScript - can be slow as hell
Erlang - super fast
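For readers who have not written a map/reduce job, the shape is roughly the following. This is a single-process Python sketch of the idea only; the stores above run their map and reduce functions in JavaScript or Erlang across a cluster, and `map_phase`, `reduce_phase`, and `map_reduce` are hypothetical names.

```python
from collections import defaultdict


def map_phase(doc):
    """Emit (key, value) pairs for one document: here, one pair per word."""
    for word in doc["text"].lower().split():
        yield word, 1


def reduce_phase(key, values):
    """Combine all values emitted for a single key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    grouped = defaultdict(list)
    for doc in docs:                     # map: every doc emits pairs
        for key, value in mapper(doc):
            grouped[key].append(value)   # shuffle: group values by key
    # reduce: collapse each key's values into one result
    return {key: reducer(key, values) for key, values in grouped.items()}


docs = [{"text": "fast writes"}, {"text": "fast reads fast"}]
counts = map_reduce(docs, map_phase, reduce_phase)
print(counts)
```

Because the mapper sees one document at a time and the reducer sees one key at a time, the work can be split across nodes, which is what makes the pattern attractive for large data sets.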
Cassandra:
Key-value + row-oriented = column family
Linear scalability and fault-tolerance on commodity hardware or cloud infrastructure
Built by Facebook (originally for inbox search)
Has CQL3 - think SQL, kind of
Pre-baked auto-clustering AMI
Super fast writes
Compresses data that’s not accessed a lot
Can tie in to Hadoop for big map reduce
So now what?
Things to think about:
Is eventual consistency OK for you?
Do you know the queries you need right now?
Is your data complicated or simple?
How fast does it grow?
How long do you want that data to hang around?
Really think about trade-offs.
Every system has its good and bad.
There is no “winner”, so stop searching “which is best”
Think about which fits your use case
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
http://docs.basho.com/riak/1.2.1/references/appendices/comparisons/
Tons of links out there, just make sure they are relatively new