Introduction to NoSQL

Agenda

RDBMS & its Limitations

ACID vs. BASE

CAP Theorem

Introduction to NoSQL & its Characteristics

Types of NoSQL Databases

Choosing the right fit

Disadvantages

Relational DBMS

Around since 1970

Use SQL to manipulate data

Easy to use

Easy to integrate with other systems

Fits most of our legacy application demands

What is the problem with RDBMS?

BASE

Basically Available: each request is guaranteed a response, whether the execution succeeded or failed

Soft state: the state of the system may change over time, at times without any input (because of eventual consistency)

Eventual consistency: the database may be momentarily inconsistent but will become consistent eventually
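Eventual consistency can be illustrated with a toy model. The sketch below assumes two in-memory replicas and a last-writer-wins rule based on a version number; the `Replica` class and the `sync` function are hypothetical names invented for this example and are not the API of any real database.

```python
class Replica:
    """Toy replica holding key -> (version, value); purely illustrative."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Keep the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Exchange entries so that both replicas end up with the newest version."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


r1, r2 = Replica(), Replica()
r1.write("cart:42", ["book"], version=1)  # accepted by one replica only
stale = r2.read("cart:42")                # the other replica has not seen it yet
sync(r1, r2)                              # background replication catches up
fresh = r2.read("cart:42")                # now both replicas agree
```

Between the write and the `sync` call the second replica answers with stale data: the system stays available but is momentarily inconsistent, and it converges once replication has run.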

CAP Theorem

Consistency, Availability, Partition tolerance: you have to choose only two. In almost all cases, you would choose availability over consistency

NoSQL (Not Only SQL) … ??

A NoSQL database provides a mechanism for storage and retrieval of data that employs less constrained consistency models than traditional relational databases.

Motivations: simplicity of design, horizontal scaling, availability.

NoSQL databases are often highly optimized key–value stores intended for simple retrieval and appending operations, with the goal being significant performance benefits in terms of latency and throughput.

Used for: Big Data and real-time web applications.

Why now?

Characteristics of NoSQL

Large data volumes.

Scalable replication and distribution (Horizontal scaling).

Queries need to return answers quickly.

Asynchronous Inserts & Updates.

Schema-less.

BASE / CAP Theorem.

No join statements.

No complicated relationships.

Less administration time (lower cost).
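Horizontal scaling usually means spreading keys across several servers. A minimal sketch of hash-based partitioning follows, assuming three servers with made-up names; `NODES` and `node_for` are hypothetical and only illustrate the idea, not how any particular product places data.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical server names


def node_for(key, nodes=NODES):
    """Pick a server for a key by hashing the key (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then wrap around the node list.
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


placement = {key: node_for(key) for key in ["user:1", "user:2", "order:10"]}
```

The same key always lands on the same server, so any client can locate it without a central lookup. Plain modulo placement moves many keys when a server is added or removed, which is one reason real systems often use more elaborate schemes.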

Types of NoSQL Databases

The NoSQL family includes several database types:

Column: HBase, Accumulo, Cassandra

Document: MongoDB, Couchbase

Key-value: Dynamo, Riak, Redis, Cache, Project Voldemort

Graph: Neo4j, AllegroGraph, Virtuoso

Key/Value Databases

Data model: collection of key/value pairs

Keys and values can be complex compounds

Designed to handle massive load

No complex query filters

All joins must be done in application code

Advantages

Very fast

Very scalable

Simple model

Able to distribute horizontally

Very predictable performance: O(1) lookups by key

Disadvantages

Many data structures (objects) can't be easily modeled as key/value pairs
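To make "joins must be done in code" concrete, here is a minimal sketch in which a plain Python dict stands in for the key/value store. The `put`, `get`, and `orders_for` helpers and the `user:` / `order:` key naming are assumptions made for illustration, not the interface of any specific product.

```python
# Hypothetical in-memory key/value store: a plain dict stands in for the database.
store = {}


def put(key, value):
    store[key] = value


def get(key, default=None):
    return store.get(key, default)


put("user:1", {"name": "Ada", "order_ids": ["order:10", "order:11"]})
put("order:10", {"item": "keyboard", "total": 45})
put("order:11", {"item": "monitor", "total": 180})


def orders_for(user_key):
    """Application-side 'join': follow the keys stored in the user record."""
    user = get(user_key, {})
    return [get(order_key) for order_key in user.get("order_ids", [])]


orders = orders_for("user:1")
total_spent = sum(order["total"] for order in orders)
```

Each lookup is a single access by key, which is where the predictable performance comes from. The cost is that the relationship between users and orders lives in application code rather than in the database.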

Column Databases

Tables are similar to RDBMS, but semi-structured

Based on Google’s Bigtable

Rows can have arbitrary columns

Distributed and decentralized

High availability & fault tolerance

Tunable consistency
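Tunable consistency is often explained with the quorum rule used by Dynamo-style stores: with N replicas, W write acknowledgements, and R read acknowledgements, a read is guaranteed to overlap the latest write when R + W > N. The sketch below only checks that inequality; the function name is hypothetical, and real systems expose this through their own consistency-level settings.

```python
def is_strongly_consistent(n_replicas, write_acks, read_acks):
    """Return True when every read quorum overlaps every write quorum (R + W > N)."""
    if not (1 <= write_acks <= n_replicas and 1 <= read_acks <= n_replicas):
        raise ValueError("acks must be between 1 and the number of replicas")
    return read_acks + write_acks > n_replicas


N = 3
quorum_both = is_strongly_consistent(N, write_acks=2, read_acks=2)     # 2 + 2 > 3
fast_and_loose = is_strongly_consistent(N, write_acks=1, read_acks=1)  # 1 + 1 <= 3
```

Lower values of R and W give faster responses at the price of possibly stale reads; higher values trade latency for consistency.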

Document Databases

Inspired by Lotus Notes

Central concept of a Document

Documents encapsulate and encode data in a standard format such as XML, YAML, JSON, or BSON
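As an illustration of the document concept, the sketch below builds one self-contained record and encodes it as JSON with Python's standard `json` module. The blog-post shape and all field names are invented for the example; a real document database would add its own storage and query layer on top.

```python
import json

# A hypothetical blog-post document; field names are illustrative only.
post = {
    "_id": "post-1",
    "title": "Introduction to NoSQL",
    "tags": ["nosql", "databases"],
    "author": {"name": "Ada", "email": "ada@example.com"},
    "comments": [
        {"by": "Bob", "text": "Nice overview"},
    ],
}

encoded = json.dumps(post)     # the document as it could be stored or sent
decoded = json.loads(encoded)  # round-trips without a fixed schema
commenters = [comment["by"] for comment in decoded["comments"]]
```

The author and the comments are nested inside the post itself, so one read returns everything needed to render it, with no join against other tables.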

Graph Database

Based on graph theory: G = (V, E)

Designed for data that is well represented in a graph

Social networks, public transport links, network topologies, road maps

Nodes, edges, properties are used to represent and store data

Graph relationships are queryable
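A small sketch of G = (V, E) and a relationship query follows, assuming a tiny social network with invented names. It uses an in-memory adjacency map and breadth-first search; graph databases provide their own storage and query languages, so this only shows the underlying idea.

```python
from collections import deque

# G = (V, E): vertices are people, edges are undirected "knows" relationships.
V = {"ann", "bob", "cat", "dan", "eve"}
E = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}

adjacency = {v: set() for v in V}
for a, b in E:
    adjacency[a].add(b)
    adjacency[b].add(a)


def hops(start, goal):
    """Breadth-first search: edges on a shortest path, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in adjacency[node] - seen:
            seen.add(neighbour)
            queue.append((neighbour, dist + 1))
    return None


ann_to_dan = hops("ann", "dan")  # ann -> bob -> cat -> dan
ann_to_eve = hops("ann", "eve")  # eve has no edges at all
```

Questions such as "how many hops separate two people" follow edges directly, which is the kind of query that becomes awkward when relationships are spread across join tables.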

Which one should I choose?

What’s best depends on your data

Priorities

What types of queries do you need to support?

How much data?

Optimized for reads, writes, or updates?

Versioning

How separate is data from app? Will other applications need to access it in future?

And how you want to interact with it

RESTful interface

Query API

Non-SQL query languages

Via indexed values, keys, nodes

File access

It too has disadvantages…

Performance and scalability achieved at the expense of feature support

No joins

Grouping and ordering become more problematic

No SQL

No transactions

Eventual consistency vs. strict consistency

Tools are often lacking

Summary

NoSQL:

Handles huge data volumes.

High availability at low cost.

More data redundancy.

High performance.

Less administration time.

Fewer standards.

SQL:

Well suited to problems that need ACID guarantees.

Expensive.

Less data redundancy.

Increasing availability means increasing cost.

More standards.

More administration.

Thank You!