NoSQL Database- cassandra column Base DB
![Page 1: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/1.jpg)
NoSQL - Part 2: CAP Theorem & Column Oriented
Mohammad Sadegh Salehi
Dr. Baraani
Winter 2015, Sheikh Bahaie University
![Page 2: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/2.jpg)
NoSQL (part 2) - CAP Theorem & Column Oriented
Winter 2015
Agenda
- Review NoSQL
- Dynamo and BigTable
- NoSQL Classification
- Key-value Stores
- Column Oriented
- Cassandra
- Why Cassandra
- Questions
![Page 3: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/3.jpg)
What is NoSQL? (Review)
• Stands for Not Only SQL
• A class of non-relational data storage systems
• Usually do not require a fixed table schema and do not use the concept of joins
• All NoSQL offerings relax one or more of the ACID properties (we will talk about the CAP theorem)
![Page 4: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/4.jpg)
Dynamo and BigTable
Three major papers were the seeds of the NoSQL movement:
• BigTable (Google)
• Dynamo (Amazon)
  - Gossip protocol (discovery and error detection)
  - Distributed key-value data store
  - Eventual consistency
• CAP Theorem (discussed in a moment)
![Page 5: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/5.jpg)
![Page 6: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/6.jpg)
What kinds of NoSQL? (Review)
NoSQL solutions fall into two major areas:
• Key/value, or "the big hash table":
  - Amazon Dynamo
  - Voldemort
  - Scalaris
• Schema-less, which comes in multiple flavors (column-based, document-based, or graph-based):
  - Cassandra (column-based)
  - CouchDB (document-based)
  - Neo4J (graph-based)
  - HBase (column-based)
![Page 7: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/7.jpg)
Key-Value Stores
Extremely simple interface
• Data model: (key, value) pairs
• Operations:
  - Insert(key, value)
  - Fetch(key)
  - Update(key)
  - Delete(key)
Implementation: efficiency, scalability, fault tolerance
• Records distributed to nodes based on key
• Replication
• Single-record transactions, "eventual consistency"
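The interface above is small enough to sketch in a few lines. The following is a minimal illustration, not how any particular product works: the class name `KeyValueCluster`, the use of MD5 to pick a node, and the plain dictionaries standing in for nodes are all assumptions made for the example. Update takes a new value here, and replication is left out.

```python
import hashlib


class KeyValueCluster:
    """Toy key-value store that spreads records over nodes by hashing the key."""

    def __init__(self, node_count=3):
        # Each "node" is just an in-memory dict in this sketch.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so the same key always lands on the same node.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def insert(self, key, value):
        self._node_for(key)[key] = value

    def fetch(self, key):
        # Returns None when the key is absent.
        return self._node_for(key).get(key)

    def update(self, key, value):
        node = self._node_for(key)
        if key not in node:
            raise KeyError(key)
        node[key] = value

    def delete(self, key):
        self._node_for(key).pop(key, None)


store = KeyValueCluster(node_count=3)
store.insert("user:42", {"language": "en", "timezone": "UTC"})
store.update("user:42", {"language": "fa", "timezone": "UTC"})
print(store.fetch("user:42"))
store.delete("user:42")
print(store.fetch("user:42"))
```

Every operation touches exactly one record on one node, which is why single-record transactions are cheap and anything spanning several keys is not.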
![Page 8: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/8.jpg)
Key-Value Data Stores - Suitable Use Cases
• Storing session information
• User profiles and preferences: almost every user has a unique userID as well as preferences such as language, color, timezone, which products the user has access to, and so on.
![Page 9: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/9.jpg)
Key-Value Data Stores - Shopping Cart Data
Shopping carts should be available all the time, across browsers, machines, and sessions, so all of the shopping information can be stored as the value, with the userID as the key.
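As a rough illustration of the shopping cart case, the whole cart can be serialized into a single value stored under the userID. In this sketch a plain dictionary (`kv_store`) stands in for the real store, and the helper names `save_cart` and `load_cart`, the JSON encoding, and the item fields are assumptions made for the example.

```python
import json

# A plain dict stands in for the key-value store; a real store would be remote.
kv_store = {}


def save_cart(user_id, items):
    # The whole cart is serialized into one opaque value under the userID key.
    kv_store[user_id] = json.dumps(items)


def load_cart(user_id):
    raw = kv_store.get(user_id)
    # An unknown user simply has an empty cart.
    return json.loads(raw) if raw is not None else []


save_cart("user-1001", [{"sku": "book-17", "qty": 1}, {"sku": "pen-03", "qty": 4}])
cart = load_cart("user-1001")
print(cart)
```

Because the store only sees an opaque value, any browser or machine that knows the userID can fetch the same cart, but the store cannot answer questions such as "which carts contain pen-03".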
![Page 10: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/10.jpg)
Key-Value Data Stores - When Not to Use
• Relationships among data
• Multi-operation transactions
• Query by data
• Operations by sets
![Page 11: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/11.jpg)
Column-oriented - Intro
Store data in column order.
Allow key-value pairs to be stored (and retrieved on key) in a massively parallel system:
• Data model: families of attributes defined in a schema; new attributes can be added
• Storing principle: big hashed distributed tables
• Properties: partitioning (horizontally and/or vertically), high availability, etc., completely transparent to the application
![Page 12: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/12.jpg)
![Page 13: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/13.jpg)
Cassandra
Apache Cassandra™ is a free
• Distributed…
• High performance…
• Extremely scalable…
• Fault tolerant (i.e. no single point of failure)…
post-relational database solution.
Cassandra can serve as both a real-time datastore and a read-intensive database.
Thrift compiles to: C++, Java, PHP, Ruby, Erlang, Perl, ...
![Page 14: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/14.jpg)
Cassandra - Infographic
![Page 15: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/15.jpg)
Cassandra
• Originally developed at Facebook
• Follows the BigTable data model: column-oriented
• Uses the Dynamo eventual consistency model
• Written in Java
• Open-sourced and exists within the Apache family
• Uses Apache Thrift as its API
• Some of its myriad users:
![Page 16: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/16.jpg)
Cassandra - Data Model
• keyspace: usually the name of the application, e.g. 'Twitter', 'Wordpress'
• column family: a structure containing an unlimited number of rows
  - Simple
  - Super (nested column families)
• column: a tuple with a name, a value, and a timestamp
• key: the name of a record
• super column: contains more columns
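The nesting of keyspace, column family, row key, and column can be made concrete with a few small classes. This is only a sketch of the concepts on the slide, not Cassandra's storage format: the class and method names are made up for the example, and the rule that the column with the newest timestamp wins is stated here as an assumption about how timestamps are used to resolve conflicting writes.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    value: str
    timestamp: int  # write time; assumed rule: the newest timestamp wins


@dataclass
class ColumnFamily:
    name: str
    # row key -> {column name -> Column}
    rows: dict = field(default_factory=dict)

    def put(self, row_key, column_name, value, timestamp=None):
        ts = timestamp if timestamp is not None else time.time_ns() // 1000
        row = self.rows.setdefault(row_key, {})
        current = row.get(column_name)
        # Keep the existing column if it is newer than the incoming write.
        if current is None or ts >= current.timestamp:
            row[column_name] = Column(column_name, value, ts)

    def get(self, row_key, column_name):
        column = self.rows.get(row_key, {}).get(column_name)
        return column.value if column else None


@dataclass
class Keyspace:
    name: str
    column_families: dict = field(default_factory=dict)

    def column_family(self, name):
        # Create the column family on first use, then reuse it.
        return self.column_families.setdefault(name, ColumnFamily(name))


twitter = Keyspace("Twitter")
users = twitter.column_family("users")
users.put("alice", "email", "alice@example.com", timestamp=1)
users.put("alice", "email", "alice@new.example.com", timestamp=2)
users.put("alice", "email", "stale@example.com", timestamp=1)  # older, ignored
users.put("bob", "city", "Isfahan", timestamp=1)  # rows need not share columns
print(users.get("alice", "email"))
```

Note that the two rows hold different column names, which is the "flexible schema" property of a column family.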
![Page 17: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/17.jpg)
Cassandra – Data Model
keyspace (settings)
  column family (settings)
    column (name, value, timestamp)
![Page 18: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/18.jpg)
Cassandra - Column Family & Super Column Family
![Page 19: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/19.jpg)
Cassandra - Architecture Overview
Cassandra was designed with the understanding that system/hardware failures can and do occur.
• Peer-to-peer, distributed system
• All nodes the same
• Data partitioned among all nodes in the cluster
• Custom data replication to ensure fault tolerance
• Read/write-anywhere design
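One common way to partition data among identical peers and keep extra copies is a hash ring. The sketch below illustrates that idea under stated assumptions: the `Ring` class, the MD5-based `token` function, and the "next nodes clockwise hold the replicas" placement rule are simplifications chosen for the example, not a description of Cassandra's actual partitioner or replication strategies.

```python
import bisect
import hashlib


def token(value):
    # Map any string to a position on the ring (an integer token).
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, node_names, replication_factor=2):
        self.replication_factor = min(replication_factor, len(node_names))
        # Sort nodes by token so the ring can be searched with bisect.
        self.ring = sorted((token(name), name) for name in node_names)
        self.tokens = [t for t, _ in self.ring]

    def replicas_for(self, row_key):
        # The first node clockwise from the key's token owns the row;
        # the next nodes on the ring hold the extra copies.
        start = bisect.bisect_right(self.tokens, token(row_key)) % len(self.ring)
        return [
            self.ring[(start + i) % len(self.ring)][1]
            for i in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
for key in ["alice", "bob", "carol"]:
    print(key, "->", ring.replicas_for(key))
```

Because every node can compute the same placement from the key alone, any node can accept a request and forward it to the right replicas, which is what makes a read/write-anywhere design possible.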
![Page 20: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/20.jpg)
Cassandra - Architecture Overview
• Nodes communicate with one another through the Gossip protocol, which exchanges information across the cluster every second.
• A commit log on each node captures write activity, which assures data durability.
• Data is also written to an in-memory structure (memtable) and then to disk (an SSTable) once the memory structure is full.
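The write path described above (commit log, then memtable, then a flush to an SSTable) can be sketched as a toy class. Everything here is illustrative: the `Node` class, the tiny `memtable_limit`, the lists standing in for files on disk, the trimming of the log after a flush, and the read method are assumptions added to make the example self-contained.

```python
class Node:
    """Toy write path: commit log -> memtable -> SSTable flush."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory structure holding recent writes
        self.sstables = []     # immutable, sorted tables "on disk" (lists here)

    def write(self, key, value):
        # 1. Record the write in the commit log first, for durability.
        self.commit_log.append((key, value))
        # 2. Apply it to the memtable.
        self.memtable[key] = value
        # 3. Once the memtable is full, flush it as a sorted SSTable.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        # Assumption: flushed writes no longer need replaying, so trim the log.
        self.commit_log = []

    def read(self, key):
        # Newest data first: memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None


node = Node(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)   # memtable reaches its limit and is flushed
node.write("a", 3)   # newer value lives in the memtable
print(node.read("a"), node.read("b"), len(node.sstables))
```

Writes only ever append to the log and update memory, so no random disk I/O is needed at write time in this model.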
![Page 21: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/21.jpg)
Why Cassandra?
• Gigabyte to petabyte scalability
• Linear performance gains through adding nodes
• No single point of failure
• Easy replication / data distribution
• Multi-data center and cloud capable
• No need for separate caching layer
• Tunable data consistency
• Flexible schema design
• Data compression
• CQL language (like SQL)
• Support for key languages and platforms
• No need for special hardware or software
![Page 22: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/22.jpg)
Why Cassandra? - Big Data Scalability
• Capable of comfortably scaling to petabytes
• New nodes = linear performance increases
• Add new nodes online
[Figure: a cluster of nodes 1-2 next to a cluster of nodes 1-4; labels: "Double Throughput", "Capabilities".]
![Page 23: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/23.jpg)
Why Cassandra? - No Single Point of Failure
• All nodes the same
• Customized replication affords tunable data redundancy
• Read/write from any node
• Can replicate data among different physical data center racks
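With data replicated across nodes and requests accepted by any node, the "tunable data consistency" item from the feature list comes down to how many replicas must answer a read or a write. The sketch below works through the usual overlap rule (read replicas + write replicas > replication factor). The function names are made up for the example; ONE, QUORUM, and ALL are used here as representative consistency levels.

```python
def replicas_required(level, replication_factor):
    # Number of replicas that must respond for a given consistency level.
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def reads_see_latest_write(write_level, read_level, replication_factor):
    # If the read set and the write set must share at least one replica,
    # a read is guaranteed to reach a replica holding the latest write.
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
    print(write_level, read_level, reads_see_latest_write(write_level, read_level, 3))
```

With three replicas, ONE/ONE favors availability and latency (the sets may not overlap), while QUORUM/QUORUM and ALL/ONE trade some of that for reads that always see the latest write.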
![Page 24: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/24.jpg)
Why Cassandra? - No Need for Caching Software
• Peer-to-peer architecture removes the need for a special caching layer and the programming that goes with it
• The database cluster uses the memory from all participating nodes to cache the data assigned to each node
• No irregularities between a memory cache and the database are encountered
[Figure: Application Servers, Memcached Servers, and a Database Server, connected by arrows labeled Writes and Reads.]
![Page 25: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/25.jpg)
Why Cassandra? - Data Compression
• Uses Google's Snappy data compression algorithm
• Compresses data on a per-column-family level
• Internal tests at DataStax show up to 80%+ compression of raw data
• No performance penalty (and some increases in overall performance due to less physical I/O)!
[Figure: Portfolio keyspace, Customer column family.]
![Page 26: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/26.jpg)
Why Cassandra? - CQL Language
• Very similar to RDBMS SQL syntax
• Create objects via DDL (e.g. CREATE…)
• Core DML commands supported: INSERT, UPDATE, DELETE
• Query data with SELECT
[Figure: Portfolio keyspace.]
SELECT *
FROM USERS
WHERE STATE = 'TX';
![Page 27: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/27.jpg)
Comparison with MySQL
Data size: > 50 GB
            Writes (average)   Reads (average)
MySQL       ~300 ms            ~350 ms
Cassandra   0.12 ms            15 ms
Stats provided by the authors, using Facebook data.
![Page 28: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/28.jpg)
Cassandra Tools
[Embedded video: noSqlCassandra-sadegh.mp4]
![Page 29: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/29.jpg)
Where to get Cassandra?
• Go to www.datastax.com
• DataStax makes free smart start installers available for Cassandra that include:
  - The most up-to-date Cassandra version that is production quality
  - A version of DataStax OpsCenter, which is a visual, browser-based management tool for managing and monitoring Cassandra
  - Drivers and connectors for popular development languages
  - Sample database and application
  - Automatic configuration assistance for ensuring optimal performance and setup for either stand-alone or cluster implementations
  - Getting Started Guide
![Page 30: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/30.jpg)
Where Can I Learn More?
www.datastax.com
• Free online documentation
• User/customer case studies
• Technical white papers
• Software downloads
• Technical articles
• User forums
• Videos
• Tutorials
• FAQs
• Blogs
![Page 31: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/31.jpg)
Resources - Sites
• Cassandra: http://cassandra.apache.org
• NoSQL news websites:
  - http://nosql.mypopescu.com
  - http://www.nosqldatabases.com
• "A Practical Guide to NoSQL", posted by Denise Miura on March 17, 2011, at http://blogs.marklogic.com/2011/03/17/a-practical-guide-to-nosql/
![Page 32: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/32.jpg)
Resources - Books
• "Cassandra: The Definitive Guide", O'Reilly Media, Nov. 2013
• "Cassandra Essential Tutorial", DataStax, 2014
• "Professional NoSQL", Wrox, 2011
• "NoSQL Distilled", Martin Fowler, 2013
![Page 33: NoSQL Database- cassandra column Base DB](https://reader033.fdocuments.in/reader033/viewer/2022052413/55a6b9dd1a28abf1088b468f/html5/thumbnails/33.jpg)
Questions