Web Scale with NoSQL
Sergejus Barinovas (@sergejusb)
http://sergejus.blogas.lt
Who Am I?
Architect at Running NoSQL servers in production
Blogger (http://sergejus.blogas.lt, @sergejusb)
Community member (http://dotnetgroup.lt)
Contact me via [email protected]
Powered by RDBMS
Used everywhere…
…even where it shouldn’t
Used for 30+ years!
Back to the 1980s…
Data boom
in numbers
600 000 000 users
30 000 servers
20+ TB raw data per day
>20 PB stored data
You really think they use RDBMS?
RDBMS Scaling Example
Simple usage
Customers
Reads / Writes → master
Scale reads
Customers
Writes → master
Reads → slave, slave
Scale writes
Customers [A-M]
Reads / Writes [A-M] → master
Customers [N-Z]
Reads / Writes [N-Z] → master
Scale reads / writes
Customers [A-M]
Writes [A-M] → master
Reads [A-M] → slave, slave
Customers [N-Z]
Writes [N-Z] → master
Reads [N-Z] → slave, slave
Pray your system won’t fail
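The last scaling step above (shard by customer name, then add read replicas per shard) can be sketched in a few lines. This is only an illustration under stated assumptions: plain dicts stand in for database servers, replication is applied synchronously, and the class names `Shard` and `ShardedCustomers` are hypothetical.

```python
import itertools


class Shard:
    """One master plus its read replicas (dicts stand in for servers)."""

    def __init__(self, name, replica_count=2):
        self.name = name
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Writes go to the master. Assumption: replication happens
        # synchronously here; real master/slave setups usually lag.
        self.master[key] = value
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads are spread over the replicas round-robin.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


class ShardedCustomers:
    """Routes each customer to shard A-M or N-Z by first letter."""

    def __init__(self):
        self.shards = {"A-M": Shard("A-M"), "N-Z": Shard("N-Z")}

    def shard_for(self, customer):
        # Anything sorting at or before "M" (including digits) lands in A-M.
        first = customer[0].upper()
        return self.shards["A-M" if first <= "M" else "N-Z"]

    def write(self, customer, record):
        self.shard_for(customer).write(customer, record)

    def read(self, customer):
        return self.shard_for(customer).read(customer)
```

Every routing rule, replica list, and failover path now lives in application code, which is the operational burden the slides are pointing at.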
Enter the NoSQL
Why NoSQL
Limited SQL scalability → sharding and vertical partitioning
Limited SQL availability → master / slave configuration
Limited SQL speed of read operations → multiple read replicas
SQL limitations for huge amounts of data → key / value / type columns
NoSQL history
2009, Eric Evans, no:sql(east)
NoSQL – open source distributed databases, not relational SQL databases
NoSQL – not only SQL
NoSQL → Big Data
NoSQL characteristics (1/2)
Scalability: the ability to horizontally scale simple-operation throughput over many servers
BASE (Basically Available, Soft state, Eventual consistency): a “weaker” concurrency model than the ACID transactions in most SQL systems
NoSQL characteristics (2/2)
Distributed: efficient use of distributed indexes and RAM for data storage
Schema-less: the ability to dynamically define new attributes or data schema
CAP theorem
2000, Eric Brewer
It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
Consistency
Availability
Partition tolerance
NoSQL Databases
NoSQL categories
Key / value store
Document database
Graph database
Columnar database
Key / value store
<key, value> or Tuple<key, v1, …, vn>
Simple operations: Get, Put, Delete
Key: Byte[]
Value: Byte[]
Key / value store
Key Value
“current_date” 2023-04-08
“sergejusb” Binary Object
“sergejusb” JSON Object
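A minimal in-memory sketch of the Get / Put / Delete interface described above, assuming byte-string keys and values. The class `KeyValueStore` is hypothetical and not the API of any product listed in this deck; the point is that the store treats values as opaque bytes and the client decides how to encode them.

```python
import json


class KeyValueStore:
    """Opaque bytes in, opaque bytes out; no querying by value."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        # True if something was actually removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put(b"current_date", b"2023-04-08")

# A JSON object is just another byte string as far as the store is concerned.
profile = {"blog": "http://sergejus.blogas.lt", "twitter": "@sergejusb"}
store.put(b"sergejusb", json.dumps(profile).encode("utf-8"))

loaded = json.loads(store.get(b"sergejusb").decode("utf-8"))
```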
Key / value stores
Redis (+)messaging (-)no shards
Voldemort
Membase (+)memcache interface
Riak
Document database
Document == complex object: XML, YAML, JSON / BSON
Support for secondary indexes
Schema can be defined at runtime
Optional support for simple querying using Map / Reduce
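The three document-database features above (schema defined at runtime, secondary indexes, Map / Reduce querying) can be illustrated with a small in-memory sketch. `DocumentCollection` and its methods are hypothetical names, not the API of MongoDB or CouchDB, and indexed field values are assumed to be hashable.

```python
from collections import defaultdict


class DocumentCollection:
    def __init__(self):
        self.docs = {}      # document id -> document (a dict of any shape)
        self.indexes = {}   # field name -> {field value -> set of ids}

    def create_index(self, field):
        index = defaultdict(set)
        for doc_id, doc in self.docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self.indexes[field] = index

    def insert(self, doc_id, doc):
        # No schema check: each document may carry its own attributes.
        # Assumption: ids are inserted once; updates are not handled.
        self.docs[doc_id] = doc
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field, value):
        if field in self.indexes:
            ids = self.indexes[field].get(value, set())
            return [self.docs[i] for i in sorted(ids)]
        # Without an index, fall back to scanning every document.
        return [d for d in self.docs.values() if d.get(field) == value]

    def map_reduce(self, map_fn, reduce_fn):
        groups = defaultdict(list)
        for doc in self.docs.values():
            for key, value in map_fn(doc):
                groups[key].append(value)
        return {k: reduce_fn(k, v) for k, v in groups.items()}


posts = DocumentCollection()
posts.create_index("author")
posts.insert("p1", {"author": "sergejusb", "tags": ["nosql", "scale"]})
posts.insert("p2", {"author": "tdagys", "tags": ["nosql"], "draft": True})

tag_counts = posts.map_reduce(
    lambda doc: [(tag, 1) for tag in doc.get("tags", [])],
    lambda tag, ones: sum(ones),
)
```

Note that the second document adds a `draft` attribute that the first one lacks, and nothing had to be declared in advance.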
Document databases
MongoDB (+)shards
CouchDB (+)master / master replication
Graph database
Graph == network
Basic constructs: Node, Edge, Properties
Example nodes: sergejus, tdagys, sergejus.blogas.lt
Example edges: authors, reads, knows, knows
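The node / edge / property model can be sketched as follows. The `Graph` class is a hypothetical illustration, and which edge connects which pair of nodes is an assumption: the slide diagram did not survive extraction, so the wiring below is only one plausible reading of it.

```python
class Graph:
    def __init__(self):
        self.nodes = {}   # node id -> properties dict
        self.edges = []   # (source id, label, target id, properties dict)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, label, target, properties))

    def neighbours(self, source, label):
        # Follow outgoing edges with the given label.
        return [t for s, lbl, t, _ in self.edges if s == source and lbl == label]


g = Graph()
g.add_node("sergejus", kind="person")
g.add_node("tdagys", kind="person")
g.add_node("sergejus.blogas.lt", kind="blog")

# Assumed wiring of the example edges (see the note above).
g.add_edge("sergejus", "authors", "sergejus.blogas.lt")
g.add_edge("tdagys", "reads", "sergejus.blogas.lt")
g.add_edge("sergejus", "knows", "tdagys")
g.add_edge("tdagys", "knows", "sergejus")
```

A query such as `g.neighbours("tdagys", "reads")` walks relationships directly instead of joining tables, which is the main appeal of this category.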
Graph databases
Neo4j (-)paid version required for scaling
FlockDB (+)fast (-)limited functionality
Columnar database
For HUGE amounts of data
Columns are added at runtime
Great scalability: horizontal, vertical
Columnar database
Unusual data model
Key Space → Database
Column Family → Table
Columns and Super Columns
Super Column → array of Columns
Column → Tuple<Key, Value, Timestamp, TTL>
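A rough sketch of this data model, assuming the column "Key" is the column name, TTL is measured in seconds, and conflicting writes are resolved by keeping the newest timestamp (a last-write-wins rule, which is an assumption here). Super Columns are left out, and the names `Column` and `ColumnFamily` are illustrative only.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    name: str
    value: bytes
    timestamp: float
    ttl: Optional[float] = None   # seconds; None means the column never expires

    def is_live(self, now):
        return self.ttl is None or now < self.timestamp + self.ttl


class ColumnFamily:
    """Plays the role of a table: row key -> {column name -> Column}."""

    def __init__(self):
        self.rows = {}

    def insert(self, row_key, name, value, timestamp=None, ttl=None):
        timestamp = time.time() if timestamp is None else timestamp
        row = self.rows.setdefault(row_key, {})
        current = row.get(name)
        # Assumed last-write-wins: an older timestamp never overwrites a newer one.
        if current is None or timestamp >= current.timestamp:
            row[name] = Column(name, value, timestamp, ttl)

    def get_row(self, row_key, now=None):
        now = time.time() if now is None else now
        row = self.rows.get(row_key, {})
        return {n: c.value for n, c in sorted(row.items()) if c.is_live(now)}


# A key space plays the role of a database: a named set of column families.
keyspace = {"Users": ColumnFamily()}
users = keyspace["Users"]
users.insert("sergejusb", "blog", b"http://sergejus.blogas.lt", timestamp=100)
users.insert("sergejusb", "twitter", b"@sergejusb", timestamp=100)
users.insert("sergejusb", "session", b"abc", timestamp=100, ttl=60)
users.insert("sergejusb", "blog", b"stale", timestamp=50)   # ignored: older write
```

Each row can hold a different set of columns, which is how columns end up being "added at runtime".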
Columnar database
Simple column
Columnar databases
Cassandra (+)easily scalable
HBase (+)consistent (+)part of Hadoop
Hypertable
NoSQL is Cool! But…
NoSQL limitations
ORDER BY? Natural key order
GROUP BY? Map / Reduce*
JOIN? Multiple Map / Reduce*
SELECT *? Multi-machine Map / Reduce*
*if possible
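To make the first two substitutions concrete, here is a small single-process sketch: ORDER BY becomes a scan in natural key order, and GROUP BY with SUM becomes a map step plus a reduce step. The `orders` data and the helpers `scan_in_key_order` and `map_reduce` are made up for illustration; a real store would run the map and reduce phases across machines.

```python
from collections import defaultdict

# Hypothetical data: keys are chosen so that key order is the order we want.
orders = {
    "order:003": {"customer": "sergejusb", "total": 12},
    "order:001": {"customer": "sergejusb", "total": 30},
    "order:002": {"customer": "tdagys", "total": 15},
}


def scan_in_key_order(store):
    # "ORDER BY" is only available on the key, so key design matters.
    return [(key, store[key]) for key in sorted(store)]


def map_reduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    return {k: reduce_fn(k, values) for k, values in groups.items()}


# Roughly: SELECT customer, SUM(total) FROM orders GROUP BY customer
totals = map_reduce(
    orders.items(),
    lambda key, order: [(order["customer"], order["total"])],
    lambda customer, amounts: sum(amounts),
)

ordered_keys = [key for key, _ in scan_in_key_order(orders)]
```

A JOIN would need a second pass of the same kind over another data set, which is why the slide marks it as multiple Map / Reduce jobs.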
NoSQL Limitations
Maturity
Tooling
Specificity
SQL vs. NoSQL
Choose the right tool for the task
You can use BOTH