NoSQL - West Virginia Universityjharner/courses/stat623/docs/NoSQL.pdf · 2014-11-10 · Selecting...
NoSQL Databases
Not Only SQL
More than one way to store data
• Not using the relational model
• Performs well on clusters
• No fixed schema
• Generally, open source
• Specialized for websites and big data
Why NoSQL?
• Developers are frustrated by the impedance mismatch between the relational model and the application model
• Most projects use object-relational mapping (ORM) tools, which have a steep learning curve
• Relational databases were not designed to run on clusters. Running very large websites (Twitter, Facebook) on an RDBMS is difficult without extensive sharding and custom infrastructure
Aggregate Data Models
• Modern applications need data grouped into units (aggregates) for ACID purposes
• Aggregate data is easy to manage over a cluster
• Inter-aggregate relationships are harder to handle; computations across aggregates are typically done with map-reduce
• Many aggregate-oriented databases also offer precomputed materialized views
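A minimal sketch of the idea, assuming hypothetical order records: each order is one aggregate that carries its own line items, and a computation that spans aggregates (revenue per product) is written as a map step followed by a reduce step. All field names and values are illustrative only.

```python
from collections import defaultdict

# Two hypothetical order aggregates; each nests everything needed
# to process that order, so it can live on a single node.
orders = [
    {"order_id": 1001, "customer_id": 42,
     "line_items": [{"sku": "A-1", "quantity": 2, "unit_price": 9.5},
                    {"sku": "B-7", "quantity": 1, "unit_price": 20.0}]},
    {"order_id": 1002, "customer_id": 7,
     "line_items": [{"sku": "A-1", "quantity": 1, "unit_price": 9.5}]},
]

def map_order(order):
    """Map step: emit (sku, revenue) pairs from one aggregate."""
    for item in order["line_items"]:
        yield item["sku"], item["quantity"] * item["unit_price"]

def reduce_revenue(pairs):
    """Reduce step: sum the emitted revenue for each sku."""
    totals = defaultdict(float)
    for sku, revenue in pairs:
        totals[sku] += revenue
    return dict(totals)

revenue_by_sku = reduce_revenue(
    pair for order in orders for pair in map_order(order)
)
```

In a real cluster the map step would run on the node that holds each aggregate and only the emitted pairs would cross the network.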
Data Distribution
Two models are used for distributing data across a cluster:
• Sharding segments the data by primary key into shards, each stored on a different server
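Sharding by primary key can be sketched as follows, assuming a simple hash-modulo placement rule and three hypothetical server names. Real systems often use range partitioning or consistent hashing instead, so this is an illustration rather than any particular database's scheme.

```python
import hashlib

# Hypothetical shard servers.
SHARDS = ["server-0", "server-1", "server-2"]

def shard_for(primary_key: str) -> str:
    """Pick a shard from a stable hash of the primary key."""
    digest = hashlib.sha256(primary_key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

def place(keys):
    """Group keys by the shard that would store them."""
    placement = {name: [] for name in SHARDS}
    for key in keys:
        placement[shard_for(key)].append(key)
    return placement
```

Because the hash is deterministic, every client computes the same shard for a given key without asking a central directory.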
Data Distribution
Two models are used for distributing data across a cluster:
• Replication copies all the data to multiple servers. Two forms:
• Master-slave
• Peer-to-peer
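A toy sketch of master-slave replication, assuming synchronous in-memory copies and the hypothetical class names `Replica` and `Master`: all writes go to the master, which copies them to every slave, and reads can be served by any node.

```python
class Replica:
    """A node that holds a full copy of the data and serves reads."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Master(Replica):
    """The single node that accepts writes and pushes them to the slaves."""

    def __init__(self, slaves):
        super().__init__()
        self.slaves = slaves

    def write(self, key, value):
        self.data[key] = value
        for slave in self.slaves:
            # Synchronous copy; real systems usually replicate asynchronously.
            slave.data[key] = value
```

In the peer-to-peer form there is no distinguished master: every node accepts writes, which removes the single point of failure but requires resolving conflicting writes.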
CAP Theorem
Choose any two:
• Consistency
• Availability
• Partition Tolerance
CAP Theorem
Consistency
All database clients will read the same value for the same query, even given concurrent updates
CAP Theorem
Availability
All database clients will always be able to read and write data
CAP Theorem
Partition Tolerance
The database can be split across multiple servers and will continue to function if the network connection between them is interrupted
CAP is the Key
• SQL databases have a fixed capability (the standard) and you can’t choose one feature over another
• NoSQL databases allow a choice of the best features for a particular system
• NoSQL requires knowledge to make a good choice
3 Faces of CAP
CA (Consistency and Availability)
• Uses 2-phase commit
• Blocks system, so only works well in a single data center
3 Faces of CAP
CP (Consistency and Partition Tolerance)
• Uses shards to scale
• Data is consistent, but there is still a risk of some data becoming unavailable if a node fails
3 Faces of CAP
AP (Availability and Partition Tolerance)
• System always available
• May return inaccurate data
• Similar to DNS
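The difference between the CP and AP choices during a network partition can be illustrated with a toy model. It assumes three replicas, a client that can reach only one of them, and a simple majority rule for the CP case; the names `Replica`, `Unavailable`, and `write` are hypothetical and do not correspond to any real database API.

```python
class Unavailable(Exception):
    """Raised when a CP-style store refuses a request during a partition."""


class Replica:
    def __init__(self, value):
        self.value = value


def write(reachable, cluster_size, value, mode):
    """Write to the reachable replicas.

    mode "CP": refuse unless a majority of the cluster is reachable.
    mode "AP": always accept, even if the replicas end up disagreeing.
    """
    if mode == "CP" and len(reachable) <= cluster_size // 2:
        raise Unavailable("no majority reachable; write refused")
    for replica in reachable:
        replica.value = value


replicas = [Replica("old") for _ in range(3)]
minority_side = replicas[:1]  # the partition leaves this client with one replica

# AP: the write succeeds, but the replicas now hold different values.
write(minority_side, len(replicas), "new", mode="AP")
values_after_ap = [r.value for r in replicas]
```

Calling `write` with `mode="CP"` on the same minority side raises `Unavailable` instead: the data stays consistent at the cost of refusing the request.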
NoSQL Database Types
• Key-Value
• Document
• Column-family
• Graph
Key-Value Databases
• Simplest API: get, put, delete
• Data is just a blob
• Might not be persistent
• Examples: Riak, Memcached, Redis
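The get/put/delete API can be sketched with an in-memory dictionary. This is a toy class under the hypothetical name `KeyValueStore`, not the client API of Riak, Memcached, or Redis; the store treats each value as an opaque blob and never looks inside it.

```python
class KeyValueStore:
    """Toy in-memory key-value store; nothing is persisted."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        # The value is stored as-is; the store does not interpret it.
        self._data[key] = value

    def get(self, key: str):
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> None:
        # Deleting a missing key is a no-op.
        self._data.pop(key, None)
```

A session store, for example, would call `put("session:abc", blob)` on login and `delete("session:abc")` on logout.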
Document Databases
• Similar to key-value except the value is in a known format and is examinable
• Data is structured (most likely JSON, BSON, or XML)
• Might not be persistent
• Examples: CouchDB, MongoDB
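Because the value is in a known format, the database can query on fields inside it. The sketch below assumes JSON documents held in a Python list and a hypothetical `find` helper; it imitates the idea only and is not the CouchDB or MongoDB query API.

```python
import json

# Hypothetical documents; note that they do not share a fixed schema.
raw_documents = [
    '{"_id": 1, "type": "article", "title": "CAP basics", "tags": ["nosql"]}',
    '{"_id": 2, "type": "article", "title": "Sharding", "author": "J. Doe"}',
    '{"_id": 3, "type": "comment", "article_id": 1, "text": "Nice"}',
]
collection = [json.loads(raw) for raw in raw_documents]

def find(docs, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

articles = find(collection, type="article")
```

A plain key-value store could only return document 2 by its key; here it can also be found with `find(collection, author="J. Doe")`.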
Column Family Databases
• Store rows that have many columns associated with a row key
• Column families are groups of related data that are often accessed together (a customer’s profile would be one family, their orders another)
• Examples: Cassandra, HBase
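The row key, column family, and column layout can be pictured as nested maps. The sketch below uses the customer example from the slide with made-up values; the helper names are hypothetical and this is not the Cassandra or HBase data API.

```python
# row key -> column family -> column name -> value
rows = {
    "customer:42": {
        "profile": {"name": "Ada", "city": "Morgantown"},
        "orders": {"1001": "shipped", "1002": "pending"},
    },
}

def get_family(row_key, family):
    """Fetch one column family of a row without touching the others."""
    return rows[row_key][family]

def put_column(row_key, family, column, value):
    """Add or overwrite a single column; rows need not share the same columns."""
    rows.setdefault(row_key, {}).setdefault(family, {})[column] = value

put_column("customer:42", "orders", "1003", "pending")
profile = get_family("customer:42", "profile")
```

Reading the profile does not load the orders, which is the point of grouping columns that are accessed together.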
Graph Databases
• Store entities and the relationships between them
• Entities have properties
• Edges (relations) can have properties and have directional significance
• Allows easy traversal of relationships
• Examples: Neo4j, InfiniteGraph
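A small sketch of the model, assuming a hypothetical social graph: entities and edges both carry properties, edges are directed, and a traversal follows one relation type outward from a starting entity. It uses plain Python structures rather than the Neo4j or InfiniteGraph APIs.

```python
from collections import deque

# Hypothetical entities with properties.
nodes = {
    "alice": {"age": 30},
    "bob": {"age": 25},
    "carol": {"age": 41},
    "dave": {"age": 19},
}

# Directed edges: (source, relation, target, edge properties).
edges = [
    ("alice", "FOLLOWS", "bob", {"since": 2012}),
    ("bob", "FOLLOWS", "carol", {"since": 2013}),
    ("alice", "FOLLOWS", "dave", {"since": 2014}),
]

def neighbors(node, relation):
    """Entities reached from `node` by one outgoing edge of this relation."""
    return [dst for src, rel, dst, _props in edges
            if src == node and rel == relation]

def reachable(start, relation):
    """Breadth-first traversal along edges of one relation type."""
    seen, queue = set(), deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current, relation):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Direction matters: `neighbors("alice", "FOLLOWS")` includes bob, but `neighbors("bob", "FOLLOWS")` does not include alice.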
Why Choose NoSQL?
• To improve programmer productivity
• To improve data access performance by handling larger data sets, reducing latency, and/or improving throughput
Selecting a NoSQL Database
• Key-value databases are used for session information, preferences, profiles, and shopping carts, which is normally data without relationships.
• Document databases are used for content management, real-time analytics, and e-commerce. Avoid them when aggregate data is often needed.
• Column family databases are useful for write-heavy operations like logging.
Selecting a NoSQL Database (continued)
• Graph databases are optimal for social networks, spatial data, and recommendation engines