Exploring NoSQL and implementing through Cassandra
![Page 1: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/1.jpg)
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.
DILEEP KALIDINDI
23rd February 2015
Explore, Build & Operate
NoSQL with
Apache Cassandra
![Page 2: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/2.jpg)
Who am I?
Dileep Varma Kalidindi
Current: Senior Engineer @Responsys (since Apr’14), Circles Team.
Fascination: Problem solving, distributed & Big Data churning systems.
Past: 8+ yrs with VeriSign, Informatica Labs, NTT Data.
Hobbies: Adventure sports.
![Page 3: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/3.jpg)
Are we good?
Data
![Page 4: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/4.jpg)
Data
Data has never come in a single structure, and neither have the techniques used to model it.
Applications evolved from OLAP, OLTP to Web, Mobile & Social.
Big Data comes with different characteristics – Volume, Velocity, Variety, Veracity & Value.
Responsys Data:
Need for better-suited data models and storage models, but why?
![Page 5: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/5.jpg)
![Page 6: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/6.jpg)
Impedance mismatch – Data model & Storage model
The SQL relational model is user oriented: the store itself takes care of concurrency, integrity, consistency and data type validity.
Hence its transactional guarantees, schemas and referential integrity.
Purpose-built applications tend to control integrity and validity themselves (and rarely need fancy in-database aggregation).
The mismatch is the difference between the persistent data model and the in-memory data structures.
Data duplication and denormalization are now first-class citizens!
Scale-up to scale-out – NoSQL multi-node clusters vs RDBMS clustering.
![Page 7: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/7.jpg)
Conceptual – ACID, BASE & CAP
Transactions, consistency and availability – can we prioritize?
![Page 8: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/8.jpg)
CAP theorem - consequences
![Page 9: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/9.jpg)
![Page 10: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/10.jpg)
Agenda
NoSQL
NoSQL implementations – for various purposes
Architecture fit – Polyglot persistence
Data modelling – concepts in view of NoSQL
Cassandra – Architecture
Database internals
CQL & DEMO
Installation, configuration & tools
Oracle NoSQL – pitch by Sheetal
![Page 11: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/11.jpg)
# NoSQL
![Page 12: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/12.jpg)
NoSQL
Non-relational, distributed, open-source & horizontally scalable #nxtGen
NoSQL is an accidental neologism.
Schema-less storage systems built for the 5 V’s of Big Data.
Decentralized – every node in the cluster is identical
High availability – no single point of failure (SPoF); designed to tolerate node and network failures
Open-source, no-cost licensing models (except for enterprise support)
![Page 13: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/13.jpg)
![Page 14: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/14.jpg)
NoSQL – Architecture fit-in
Polyglot persistence thinking: fit in the right data store for each data set.
Service usage over direct data usage.
Concerns
Operational concerns like licensing, support, tools, upgrade, auditing.
Security of the data store, contexts, authorization etc.
Integration with ETL and data transfer utilities.
Deployment complexity
![Page 15: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/15.jpg)
Data models – in view of NoSQL
NoSQL models are application specific “What questions do I have?”
Relational models are driven by structure of data “What answers do I have?”
Modelling techniques
Conceptual: Denormalization, Aggregates & Application-side joins
General: Atomic aggregates, Enumerable keys, Dimensionality reduction, Index table & Composite key index.
Hierarchical: Tree aggregation, Materialized paths, Nested sets & batch graph processing.
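The two conceptual techniques above, application-side joins and denormalization into aggregates, can be sketched in a few lines of Python. This is only an illustration: the `users` and `orders` data and both function names are hypothetical and not taken from the talk.

```python
# Hypothetical sample data; names and fields are illustrative only.
users = {"u1": {"name": "Asha"}, "u2": {"name": "Ravi"}}
orders = [
    {"order_id": "o1", "user_id": "u1", "total": 250},
    {"order_id": "o2", "user_id": "u1", "total": 100},
    {"order_id": "o3", "user_id": "u2", "total": 75},
]


def orders_with_names_app_join(users, orders):
    """Application-side join: read both data sets, stitch them in client code."""
    return [{**o, "user_name": users[o["user_id"]]["name"]} for o in orders]


def denormalize_by_user(users, orders):
    """Denormalized aggregate: one record per user that already answers
    the question 'what did user X order, and how much did they spend?'."""
    aggregates = {}
    for o in orders:
        agg = aggregates.setdefault(
            o["user_id"],
            {"name": users[o["user_id"]]["name"], "orders": [], "total_spent": 0},
        )
        agg["orders"].append({"order_id": o["order_id"], "total": o["total"]})
        agg["total_spent"] += o["total"]
    return aggregates


joined = orders_with_names_app_join(users, orders)
by_user = denormalize_by_user(users, orders)
```

The aggregate duplicates the user name into every record, which is the trade the slide describes: more stored data in exchange for a single read per question.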
![Page 16: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/16.jpg)
Data models – deep view
Conceptual: Denormalization
Query data volume or I/O per query vs total data volume
Processing complexity vs total data volume
Aggregates:
Simple
Atomic
Tree aggregation:
![Page 17: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/17.jpg)
![Page 18: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/18.jpg)
NoSQL - implementations
If one implementation fits all, then why not RDBMS?
Classification is driven from the application point of view!
Key-Value
Strong aggregate which is opaque to the database
Oracle NoSQL, Windows Azure & Redis
Document database
Structure in the aggregate
MongoDB, CouchDB & RavenDB
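The difference between an opaque aggregate (key-value) and a structured aggregate (document) can be sketched with plain Python containers. This is a toy illustration under assumed names (`kv_store`, `doc_store`, `kv_get_city`, `doc_find`), not the API of any of the products listed.

```python
import json

# Key-value: the value is an opaque blob, so the store can only get/put by key.
kv_store = {}
profile = {"name": "Asha", "city": "Bengaluru", "tags": ["email", "sms"]}
kv_store["user:u1"] = json.dumps(profile).encode("utf-8")


def kv_get_city(store, key):
    """The application fetches and decodes the whole value to read one field."""
    return json.loads(store[key].decode("utf-8"))["city"]


# Document: the structure of the aggregate is visible to the store,
# so it can filter on fields inside the aggregate.
doc_store = [
    {"_id": "u1", "name": "Asha", "city": "Bengaluru"},
    {"_id": "u2", "name": "Ravi", "city": "Hyderabad"},
]


def doc_find(store, **criteria):
    """Return every document whose fields match all the given criteria."""
    return [d for d in store if all(d.get(k) == v for k, v in criteria.items())]
```

For example, `doc_find(doc_store, city="Hyderabad")` filters inside the store, while `kv_get_city(kv_store, "user:u1")` has to pull the full blob back to the application first.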
![Page 19: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/19.jpg)
NoSQL - implementations
Column family structures
Two-level aggregate structure
Key & a row aggregate; the row aggregate is a group of columns.
Bigtable, HBase & Cassandra
Graph database
Neo4j
![Page 20: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/20.jpg)
NoSQL – implementations – CAP fit
![Page 21: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/21.jpg)
![Page 22: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/22.jpg)
Apache Cassandra - Continuous availability, linear scalability & operational simplicity
About
Column store NoSQL database.
Originally developed by Facebook (2007) and now an Apache project
Masterless architecture with all nodes in a ring topology
Commercial add-ons & support (“enterprise edition”) by DataStax
Data center replication, scalability (wide), fault tolerance & tunable consistency.
Online load balancing, flexible schema, key-oriented queries & CAP-aware
Implementation of good security standards, operations, monitoring & utilities.
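Tunable consistency is usually reasoned about with simple replica arithmetic: with N replicas, a write acknowledged by W of them and a read that consults R of them must overlap in at least one replica when R + W > N. The sketch below only does that arithmetic; the function names are illustrative and it does not talk to a cluster.

```python
def quorum(replication_factor):
    """Size of a majority of replicas, as used by a QUORUM consistency level."""
    return replication_factor // 2 + 1


def read_sees_latest_write(n, w, r):
    """True when every read set of r replicas must overlap every write set
    of w replicas out of n, i.e. when r + w > n."""
    return r + w > n


# Replication factor 3: QUORUM writes plus QUORUM reads overlap,
# while writing and reading a single replica (ONE/ONE) may not.
rf = 3
quorum_both = read_sees_latest_write(rf, quorum(rf), quorum(rf))
one_one = read_sees_latest_write(rf, 1, 1)
```

Lower values of W and R trade that overlap for lower latency and higher availability, which is the "tunable" part.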
![Page 23: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/23.jpg)
Cassandra – data model
Column – key-value pair
Counter column
Expiring column
Super column
Column family – collection of rows - Map<RowKey, OrderedColumnCollection>
Dynamic (wide)
Static (narrow)
Keyspace – contains column families & super column families
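The "map of row key to ordered columns" view of a column family can be mimicked with a small Python class. This is a toy model for intuition only: the class, its method names and the sample rows are assumptions, and it says nothing about how Cassandra stores data on disk.

```python
class ColumnFamily:
    """Toy column family: row key -> columns, read back in column-name order."""

    def __init__(self):
        self._rows = {}

    def insert(self, row_key, column, value):
        # Each row holds its own set of columns; rows need not share columns.
        self._rows.setdefault(row_key, {})[column] = value

    def get_slice(self, row_key, start, end):
        """Return (name, value) pairs with start <= name <= end, sorted by name."""
        row = self._rows.get(row_key, {})
        return [(name, row[name]) for name in sorted(row) if start <= name <= end]


# Static (narrow) row: a small, fixed set of column names.
users = ColumnFamily()
users.insert("u1", "name", "Asha")
users.insert("u1", "email", "asha@example.com")

# Dynamic (wide) row: column names are data, here hypothetical timestamps.
events = ColumnFamily()
events.insert("sensor-1", "2015-02-23T10:05", 21.5)
events.insert("sensor-1", "2015-02-23T09:55", 21.1)
events.insert("sensor-1", "2015-02-23T10:30", 22.0)

last_hour = events.get_slice("sensor-1", "2015-02-23T10:00", "2015-02-23T10:59")
```

Because columns come back ordered by name, a wide row keyed by timestamp supports range reads such as `last_hour` without scanning other rows.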
![Page 24: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/24.jpg)
Cassandra – Architecture
CAP values – AP (Availability & Partition tolerance).
Consistency is eventual (stronger consistency is available at the cost of latency).
No row locking (HBase wins!)
Linear scaling of Cassandra – throughput vs number of nodes.
Cassandra cluster – the partitioner generates tokens for row keys
Write in action
Read in action
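How a partitioner maps row keys to nodes on the ring can be sketched as follows. This is a simplified model: it hashes keys with MD5, gives each node one evenly spaced token, and places replicas on the next nodes clockwise. The ring size, node names and the `Ring` class are assumptions made for illustration, not Cassandra's actual implementation.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # token space chosen for this sketch


def token_for(row_key):
    """Hash a row key to a token on the ring."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class Ring:
    def __init__(self, node_tokens):
        # node_tokens: node name -> token; a node owns the range ending at its token.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def replicas(self, row_key, replication_factor=1):
        """First owner is the node with the smallest token >= the key's token
        (wrapping around); further replicas are the next nodes clockwise."""
        start = bisect_left(self._tokens, token_for(row_key)) % len(self._ring)
        count = min(replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


# Four hypothetical nodes with evenly spaced tokens.
nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = Ring({name: i * RING_SIZE // len(nodes) for i, name in enumerate(nodes)})
owners = ring.replicas("user:42", replication_factor=3)
```

Any node can compute `owners` from the key alone, which is why a masterless cluster needs no central lookup to route a read or a write.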
![Page 25: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/25.jpg)
Installation & Configuration
Yum installation is the easiest – add the repository in /etc/yum.repos.d/datastax.repo
cassandra.yaml configuration
cluster_name, data_file_directories, commitlog_directory
Directory locations
Start Cassandra: cassandra -f
Start CLI: cqlsh
Stop Cassandra: service stop or process kill
![Page 26: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/26.jpg)
Demo
![Page 27: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/27.jpg)
CQL in action
CQL 3.0 is much like SQL. Names are case-insensitive unless double-quoted
CQL data types:
Create keyspace: Responsys_Demo
Create table, index, user
Many other SQL-like statements (but no joins)!
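The kind of statements the demo walks through can be written out as below. The CQL is held in Python strings and is not executed here; it would be pasted into cqlsh or sent through a driver. The table, index and column names are hypothetical, and `fold_identifier` is only a small model of how CQL treats unquoted versus double-quoted names.

```python
def fold_identifier(name):
    """Model of CQL name handling: unquoted names fold to lower case,
    double-quoted names keep their case."""
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        return name[1:-1]
    return name.lower()


# 'Responsys_Demo' typed without quotes is stored as 'responsys_demo'.
KEYSPACE = fold_identifier("Responsys_Demo")

# Illustrative CQL 3 statements (hypothetical schema), kept as plain strings.
statements = [
    f"CREATE KEYSPACE IF NOT EXISTS {KEYSPACE} "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};",
    f"CREATE TABLE IF NOT EXISTS {KEYSPACE}.users "
    "(user_id uuid PRIMARY KEY, name text, email text);",
    f"CREATE INDEX IF NOT EXISTS users_email_idx ON {KEYSPACE}.users (email);",
    f"INSERT INTO {KEYSPACE}.users (user_id, name, email) "
    "VALUES (uuid(), 'Asha', 'asha@example.com');",
    f"SELECT name, email FROM {KEYSPACE}.users WHERE email = 'asha@example.com';",
]

for statement in statements:
    print(statement)
```

A replication factor of 1 with `SimpleStrategy` is a single-node demo setting; a multi-node cluster would choose these values differently.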
![Page 28: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/28.jpg)
Cassandra – Monitoring
JMX interface – DEMO
nodetool – Cassandra JMX interface
cfstats
netstats
ring & other operations
DataStax OpsCenter
Nagios monitoring
Cassandra logging & GC logging
![Page 29: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/29.jpg)
Summary, Conclusions & References
![Page 30: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/30.jpg)
Summary – Quick recap
Data evolution
ACID, BASE & CAP
NoSQL, data models, implementations
Cassandra & data model
Architecture
Installations & operations
![Page 31: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/31.jpg)
Links & References
• https://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/
• http://www.thoughtworks.com/insights/blog/nosql-databases-overview
• http://www.dia.uniroma3.it/~torlone/bigdata/L6-NoSQL.pdf
• http://radar.oreilly.com/2013/03/returning-transactions-to-distributed-data-stores.html
![Page 32: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/32.jpg)
Q & A
![Page 33: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/33.jpg)
Thank you
![Page 34: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/34.jpg)
![Page 35: Exploring NoSQL and implementing through Cassandra](https://reader036.fdocuments.in/reader036/viewer/2022062523/58ec0c711a28ab26268b45ff/html5/thumbnails/35.jpg)
APPENDIX