NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have...
Transcript of NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have...
![Page 3: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/3.jpg)
Where Are We?
2 / 90
![Page 4: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/4.jpg)
Database and Database Management System
I Database: an organized collection of data.
I Database Management System (DBMS): a software that interacts with users, otherapplications, and the database itself to capture and analyze data.
3 / 90
![Page 5: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/5.jpg)
Relational Databases Management Systems (RDMBSs)
I RDMBSs: the dominant technology for storing structured data in web and businessapplications.
I SQL is good• Rich language and toolset• Easy to use and integrate• Many vendors
I They promise: ACID
4 / 90
![Page 6: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/6.jpg)
ACID Properties
I Atomicity• All included statements in a transaction are either executed or the whole transaction is
aborted without affecting the database.
I Consistency• A database is in a consistent state before and after a transaction.
I Isolation• Transactions can not see uncommitted changes in the database.
I Durability• Changes are written to a disk before a database commits a transaction so that committed
data cannot be lost through a power failure.
5 / 90
![Page 7: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/7.jpg)
ACID Properties
I Atomicity• All included statements in a transaction are either executed or the whole transaction is
aborted without affecting the database.
I Consistency• A database is in a consistent state before and after a transaction.
I Isolation• Transactions can not see uncommitted changes in the database.
I Durability• Changes are written to a disk before a database commits a transaction so that committed
data cannot be lost through a power failure.
5 / 90
![Page 8: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/8.jpg)
ACID Properties
I Atomicity• All included statements in a transaction are either executed or the whole transaction is
aborted without affecting the database.
I Consistency• A database is in a consistent state before and after a transaction.
I Isolation• Transactions can not see uncommitted changes in the database.
I Durability• Changes are written to a disk before a database commits a transaction so that committed
data cannot be lost through a power failure.
5 / 90
![Page 9: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/9.jpg)
ACID Properties
I Atomicity• All included statements in a transaction are either executed or the whole transaction is
aborted without affecting the database.
I Consistency• A database is in a consistent state before and after a transaction.
I Isolation• Transactions can not see uncommitted changes in the database.
I Durability• Changes are written to a disk before a database commits a transaction so that committed
data cannot be lost through a power failure.
5 / 90
![Page 10: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/10.jpg)
RDBMS Challenges
I Web-based applications caused spikes.• Internet-scale data size• High read-write rates• Frequent schema changes
6 / 90
![Page 11: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/11.jpg)
Let’s Scale RDBMSs
I RDBMS were not designed to be distributed.
I Possible solutions:• Replication• Sharding
7 / 90
![Page 12: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/12.jpg)
Let’s Scale RDBMSs - Replication
I Master/Slave architecture
I Scales read operations
8 / 90
![Page 13: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/13.jpg)
Let’s Scale RDBMSs - Sharding
I Dividing the database across many machines.
I It scales read and write operations.
I Cannot execute transactions across shards (partitions).
9 / 90
![Page 14: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/14.jpg)
Scaling RDBMSs is Expensive and Inefficient
[http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf]
10 / 90
![Page 15: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/15.jpg)
11 / 90
![Page 16: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/16.jpg)
NoSQL
I Avoids:• Overhead of ACID properties• Complexity of SQL query
I Provides:• Scalablity• Easy and frequent changes to DB• Large data volumes
12 / 90
![Page 17: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/17.jpg)
NoSQL Cost and Performance
[http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf]
13 / 90
![Page 18: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/18.jpg)
RDBMS vs. NoSQL
[http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/NoSQLWhitepaper.pdf]
14 / 90
![Page 19: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/19.jpg)
NoSQL Data Models
15 / 90
![Page 20: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/20.jpg)
NoSQL Data Models
[http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques]
16 / 90
![Page 21: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/21.jpg)
Key-Value Data Model
I Collection of key/value pairs.
I Ordered Key-Value: processing over key ranges.
I Dynamo, Scalaris, Voldemort, Riak, ...
17 / 90
![Page 22: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/22.jpg)
Column-Oriented Data Model
I Similar to a key/value store, but the value can have multiple attributes (Columns).
I Column: a set of data values of a particular type.
I Store and process data by column instead of row.
I BigTable, Hbase, Cassandra, ...
18 / 90
![Page 23: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/23.jpg)
Document Data Model
I Similar to a column-oriented store, but values can have complex documents.
I Flexible schema (XML, YAML, JSON, and BSON).
I CouchDB, MongoDB, ...
{
FirstName: "Bob",
Address: "5 Oak St.",
Hobby: "sailing"
}
{
FirstName: "Jonathan",
Address: "15 Wanamassa Point Road",
Children: [
{Name: "Michael", Age: 10},
{Name: "Jennifer", Age: 8},
]
}
19 / 90
![Page 24: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/24.jpg)
Graph Data Model
I Uses graph structures with nodes, edges, and properties to represent and store data.
I Neo4J, InfoGrid, ...
[http://en.wikipedia.org/wiki/Graph database]
20 / 90
![Page 25: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/25.jpg)
Consistency
21 / 90
![Page 26: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/26.jpg)
Consistency
I Strong consistency• After an update completes, any subsequent access will return the updated value.
I Eventual consistency• Does not guarantee that subsequent accesses will return the updated value.• Inconsistency window.• If no new updates are made to the object, eventually all accesses will return the last
updated value.
22 / 90
![Page 27: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/27.jpg)
Consistency
I Strong consistency• After an update completes, any subsequent access will return the updated value.
I Eventual consistency• Does not guarantee that subsequent accesses will return the updated value.• Inconsistency window.• If no new updates are made to the object, eventually all accesses will return the last
updated value.
22 / 90
![Page 28: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/28.jpg)
Quorum Model
I N: the number of nodes to which a data item is replicated.
I R: the number of nodes a value has to be read from to be accepted.
I W: the number of nodes a new value has to be written to before the write operationis finished.
I To enforce strong consistency: R + W > N
23 / 90
![Page 29: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/29.jpg)
Quorum Model
I N: the number of nodes to which a data item is replicated.
I R: the number of nodes a value has to be read from to be accepted.
I W: the number of nodes a new value has to be written to before the write operationis finished.
I To enforce strong consistency: R + W > N
23 / 90
![Page 30: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/30.jpg)
CAP Theorem
I Consistency• Consistent state of data after the execution of an operation.
I Availability• Clients can always read and write data.
I Partition Tolerance• Continue the operation in the presence of network partitions.
I You can choose only two!
24 / 90
![Page 31: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/31.jpg)
Consistency vs. Availability
I The large-scale applications have to be reliable: availability + partition tolerance
I These properties are difficult to achieve with ACID properties.
I The BASE approach forfeits the ACID properties of consistency and isolation in favorof availability and performance.
25 / 90
![Page 32: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/32.jpg)
BASE Properties
I Basic Availability• Possibilities of faults but not a fault of the whole system.
I Soft-state• Copies of a data item may be inconsistent
I Eventually consistent• Copies becomes consistent at some later time if there are no more updates to that data
item
26 / 90
![Page 33: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/33.jpg)
27 / 90
![Page 34: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/34.jpg)
Dyanmo
28 / 90
![Page 35: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/35.jpg)
Dynamo
I Distributed key/value storage system
I Scalable and Highly available
I CAP: it sacrifices strong consistency for availability: always writable
29 / 90
![Page 36: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/36.jpg)
Data Model
30 / 90
![Page 37: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/37.jpg)
Data Model
[http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques]
31 / 90
![Page 38: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/38.jpg)
Partitioning
I Key/value, where values are stored as objects.
I If size of data exceeds the capacity of a single machine: partitioning
I Consistent hashing is one form of sharding (partitioning).
32 / 90
![Page 39: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/39.jpg)
Partitioning
I Key/value, where values are stored as objects.
I If size of data exceeds the capacity of a single machine: partitioning
I Consistent hashing is one form of sharding (partitioning).
32 / 90
![Page 40: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/40.jpg)
Consistent Hashing
I Hash both data and nodes using the same hash function in a same id space.
I partition = hash(d) mod n, d: data, n: number of nodes
hash("Fatemeh") = 12
hash("Ahmad") = 2
hash("Seif") = 9
hash("Jim") = 14
hash("Sverker") = 4
33 / 90
![Page 41: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/41.jpg)
Consistent Hashing
I Hash both data and nodes using the same hash function in a same id space.
I partition = hash(d) mod n, d: data, n: number of nodes
hash("Fatemeh") = 12
hash("Ahmad") = 2
hash("Seif") = 9
hash("Jim") = 14
hash("Sverker") = 4
33 / 90
![Page 42: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/42.jpg)
Replication
I To achieve high availability and durability, data should be replicates on multiplenodes.
34 / 90
![Page 43: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/43.jpg)
Data Consistency
35 / 90
![Page 44: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/44.jpg)
Data Consistency
I Eventual consistency: updates are propagated asynchronously.
I Each update/modification of an item results in a new and immutable version of thedata.
• Multiple versions of an object may exist.
I Replicas eventually become consistent.
36 / 90
![Page 45: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/45.jpg)
Data Versioning (1/2)
I Use vector clocks for capturing causality, in the form of (node, counter)• If causal: older version can be forgotten• If concurrent: conflict exists, requiring reconciliation
I Version branching can happen due to node/network failures.
37 / 90
![Page 46: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/46.jpg)
Data Versioning (1/2)
I Use vector clocks for capturing causality, in the form of (node, counter)• If causal: older version can be forgotten• If concurrent: conflict exists, requiring reconciliation
I Version branching can happen due to node/network failures.
37 / 90
![Page 47: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/47.jpg)
Data Versioning (2/2)
I Client C1 writes new object via Sx.
I C1 updates the object via Sx.
I C1 updates the object via Sy.
I C2 reads D2 and updates theobject via Sz.
I C3 reads D3 and D4 via Sx.• The read context is a
summary of the clocks of D3and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
I Reconciliation
38 / 90
![Page 48: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/48.jpg)
Data Versioning (2/2)
I Client C1 writes new object via Sx.
I C1 updates the object via Sx.
I C1 updates the object via Sy.
I C2 reads D2 and updates theobject via Sz.
I C3 reads D3 and D4 via Sx.• The read context is a
summary of the clocks of D3and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
I Reconciliation
38 / 90
![Page 49: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/49.jpg)
Data Versioning (2/2)
I Client C1 writes new object via Sx.
I C1 updates the object via Sx.
I C1 updates the object via Sy.
I C2 reads D2 and updates theobject via Sz.
I C3 reads D3 and D4 via Sx.• The read context is a
summary of the clocks of D3and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
I Reconciliation
38 / 90
![Page 50: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/50.jpg)
Data Versioning (2/2)
I Client C1 writes new object via Sx.
I C1 updates the object via Sx.
I C1 updates the object via Sy.
I C2 reads D2 and updates theobject via Sz.
I C3 reads D3 and D4 via Sx.• The read context is a
summary of the clocks of D3and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
I Reconciliation
38 / 90
![Page 51: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/51.jpg)
Data Versioning (2/2)
I Client C1 writes new object via Sx.
I C1 updates the object via Sx.
I C1 updates the object via Sy.
I C2 reads D2 and updates theobject via Sz.
I C3 reads D3 and D4 via Sx.• The read context is a
summary of the clocks of D3and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
I Reconciliation
38 / 90
![Page 52: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/52.jpg)
Data Versioning (2/2)
I Client C1 writes new object via Sx.
I C1 updates the object via Sx.
I C1 updates the object via Sy.
I C2 reads D2 and updates theobject via Sz.
I C3 reads D3 and D4 via Sx.• The read context is a
summary of the clocks of D3and D4: [(Sx, 2), (Sy, 1), (Sz, 1)].
I Reconciliation
38 / 90
![Page 53: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/53.jpg)
Dynamo API
39 / 90
![Page 54: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/54.jpg)
Dynamo API
I get(key)• Return single object or list of objects with conflicting version and context.
I put(key, context, object)• Store object and context under key.• Context encodes system metadata, e.g., version number.
40 / 90
![Page 55: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/55.jpg)
put Operation
I Coordinator generates new vector clock and writes the new version locally.
I Send to N nodes.
I Wait for response from W nodes.
41 / 90
![Page 56: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/56.jpg)
get Operation
I Coordinator requests existing versions from N.• Wait for response from R nodes.
I If multiple versions, return all versions that are causally unrelated.
I Divergent versions are then reconciled.
I Reconciled version written back.
42 / 90
![Page 57: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/57.jpg)
Membership Management
43 / 90
![Page 58: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/58.jpg)
Membership Management
I Administrator explicitly adds and removes nodes.
I Gossiping to propagate membership changes.• Eventually consistent view.• O(1) hop overlay.
44 / 90
![Page 59: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/59.jpg)
Adding and Removing Nodes
I A new node X added to system.• X is assigned key ranges w.r.t. its virtual servers.• For each key range, it transfers the data items.
I Removing a node: reallocation of keys is a reverse process of adding nodes.
45 / 90
![Page 60: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/60.jpg)
Adding and Removing Nodes
I A new node X added to system.• X is assigned key ranges w.r.t. its virtual servers.• For each key range, it transfers the data items.
I Removing a node: reallocation of keys is a reverse process of adding nodes.
45 / 90
![Page 61: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/61.jpg)
Failure Detection
I Passive failure detection.• Use pings only for detection from failed to alive.
I In the absence of client requests, node A doesn’t need to know if node B is alive.
46 / 90
![Page 62: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/62.jpg)
BigTable
47 / 90
![Page 63: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/63.jpg)
Motivation
I Lots of (semi-)structured data at Google.• URLs, TextGreenper-user data, geographical locations, ...
I Big data• Billions of URLs, hundreds of millions of users, 100+TB of satellite image data, ...
48 / 90
![Page 64: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/64.jpg)
BigTable
I Distributed multi-level map
I Fault-tolerant
I Scalable and self-managing
I CAP: strong consistency and partition tolerance
49 / 90
![Page 65: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/65.jpg)
Data Model
50 / 90
![Page 66: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/66.jpg)
Data Model
[http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques]
51 / 90
![Page 67: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/67.jpg)
Column-Oriented Data Model (1/2)
I Similar to a key/value store, but the value can have multiple attributes (Columns).
I Column: a set of data values of a particular type.
I Store and process data by column instead of row.
52 / 90
![Page 68: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/68.jpg)
Columns-Oriented Data Model (2/2)
I In many analytical databases queries, few attributes are needed.
I Column values are stored contiguously on disk: reduces I/O.
[Lars George, “Hbase: The Definitive Guide”, O’Reilly, 2011]
53 / 90
![Page 69: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/69.jpg)
BigTable Data Model (1/5)
I Table
I Distributed multi-dimensional sparse map
54 / 90
![Page 70: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/70.jpg)
BigTable Data Model (2/5)
I Rows
I Every read or write in a row is atomic.
I Rows sorted in lexicographical order.
55 / 90
![Page 71: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/71.jpg)
BigTable Data Model (3/5)
I Column
I The basic unit of data access.
I Column families: group of (the same type) column keys.
I Column key naming: family:qualifier
56 / 90
![Page 72: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/72.jpg)
BigTable Data Model (4/5)
I Timestamp
I Each column value may contain multiple versions.
57 / 90
![Page 73: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/73.jpg)
BigTable Data Model (5/5)
I Tablet: contiguous ranges of rows stored together.
I Tables are split by the system when they become too large.
I Each tablet is served by exactly one tablet server.
58 / 90
![Page 74: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/74.jpg)
BigTable Architecture
59 / 90
![Page 75: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/75.jpg)
BigTable Cell
60 / 90
![Page 76: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/76.jpg)
Main Components
I Master server
I Tablet server
I Client library
61 / 90
![Page 77: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/77.jpg)
Master Server
I Assigns tablets to tablet server.
I Balances tablet server load.
I Garbage collection of unneeded files in GFS.
I Handles schema changes, e.g., table and column family creations
62 / 90
![Page 78: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/78.jpg)
Tablet Server
I Can be added or removed dynamically.
I Each manages a set of tablets (typically 10-1000 tablets/server).
I Handles read/write requests to tablets.
I Splits tablets when too large.
63 / 90
![Page 79: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/79.jpg)
Client Library
I Library that is linked into every client.
I Client data does not move though the master.
I Clients communicate directly with tablet servers for reads/writes.
64 / 90
![Page 80: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/80.jpg)
Building Blocks
I The building blocks for the BigTable are:• Google File System (GFS): raw storage• Chubby: distributed lock manager• Scheduler: schedules jobs onto machines
65 / 90
![Page 81: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/81.jpg)
Google File System (GFS)
I Large-scale distributed file system.
I Store log and data files.
66 / 90
![Page 82: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/82.jpg)
Chubby Lock Service
I Ensure there is only one active master.
I Store bootstrap location of BigTable data.
I Discover tablet servers.
I Store BigTable schema information.
I Store access control lists.
67 / 90
![Page 83: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/83.jpg)
Table Serving
68 / 90
![Page 84: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/84.jpg)
Master Startup
I The master executes the following steps at startup:
• Grabs a unique master lock in Chubby, which prevents concurrent master instantiations.
• Scans the servers directory in Chubby to find the live servers.
• Communicates with every live tablet server to discover what tablets are already assignedto each server.
• Scans the METADATA table to learn the set of tablets.
69 / 90
![Page 85: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/85.jpg)
Master Startup
I The master executes the following steps at startup:
• Grabs a unique master lock in Chubby, which prevents concurrent master instantiations.
• Scans the servers directory in Chubby to find the live servers.
• Communicates with every live tablet server to discover what tablets are already assignedto each server.
• Scans the METADATA table to learn the set of tablets.
69 / 90
![Page 86: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/86.jpg)
Master Startup
I The master executes the following steps at startup:
• Grabs a unique master lock in Chubby, which prevents concurrent master instantiations.
• Scans the servers directory in Chubby to find the live servers.
• Communicates with every live tablet server to discover what tablets are already assignedto each server.
• Scans the METADATA table to learn the set of tablets.
69 / 90
![Page 87: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/87.jpg)
Master Startup
I The master executes the following steps at startup:
• Grabs a unique master lock in Chubby, which prevents concurrent master instantiations.
• Scans the servers directory in Chubby to find the live servers.
• Communicates with every live tablet server to discover what tablets are already assignedto each server.
• Scans the METADATA table to learn the set of tablets.
69 / 90
![Page 88: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/88.jpg)
Master Startup
I The master executes the following steps at startup:
• Grabs a unique master lock in Chubby, which prevents concurrent master instantiations.
• Scans the servers directory in Chubby to find the live servers.
• Communicates with every live tablet server to discover what tablets are already assignedto each server.
• Scans the METADATA table to learn the set of tablets.
69 / 90
![Page 89: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/89.jpg)
Tablet Assignment
I 1 tablet → 1 tablet server.
I Master uses Chubby to keep tracks of live tablet serves and unassigned tablets.• When a tablet server starts, it creates and acquires an exclusive lock in Chubby.
I Master detects the status of the lock of each tablet server by checking periodically.
I Master is responsible for finding when tablet server is no longer serving its tabletsand reassigning those tablets as soon as possible.
70 / 90
![Page 90: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/90.jpg)
Tablet Assignment
I 1 tablet → 1 tablet server.
I Master uses Chubby to keep tracks of live tablet serves and unassigned tablets.• When a tablet server starts, it creates and acquires an exclusive lock in Chubby.
I Master detects the status of the lock of each tablet server by checking periodically.
I Master is responsible for finding when tablet server is no longer serving its tabletsand reassigning those tablets as soon as possible.
70 / 90
![Page 91: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/91.jpg)
Tablet Assignment
I 1 tablet → 1 tablet server.
I Master uses Chubby to keep tracks of live tablet serves and unassigned tablets.• When a tablet server starts, it creates and acquires an exclusive lock in Chubby.
I Master detects the status of the lock of each tablet server by checking periodically.
I Master is responsible for finding when tablet server is no longer serving its tabletsand reassigning those tablets as soon as possible.
70 / 90
![Page 92: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/92.jpg)
Tablet Assignment
I 1 tablet → 1 tablet server.
I Master uses Chubby to keep tracks of live tablet serves and unassigned tablets.• When a tablet server starts, it creates and acquires an exclusive lock in Chubby.
I Master detects the status of the lock of each tablet server by checking periodically.
I Master is responsible for finding when tablet server is no longer serving its tabletsand reassigning those tablets as soon as possible.
70 / 90
![Page 93: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/93.jpg)
Finding a Tablet
I Three-level hierarchy.
I The first level is a file stored in Chubby that contains the location of the root tablet.
I Root tablet contains location of all tablets in a special METADATA table.
I METADATA table contains location of each tablet under a row.
I The client library caches tablet locations.
71 / 90
![Page 94: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/94.jpg)
SSTable (1/2)
I SSTable file format used internally to store Bigtable data.
I Immutable, sorted file of key-value pairs.
I Each SSTable is stored in a GFS file.
72 / 90
![Page 95: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/95.jpg)
SSTable (2/2)
I Chunks of data plus a block index.• A block index is used to locate blocks.• The index is loaded into memory when the SSTable is opened.
73 / 90
![Page 96: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/96.jpg)
Tablet Serving (1/2)
I Updates committed to a commit log.
I Recently committed updates are stored in memory - memtable
I Older updates are stored in a sequence of SSTables.
74 / 90
![Page 97: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/97.jpg)
Tablet Serving (2/2)
I Strong consistency• Only one tablet server is responsible for a given piece of data.• Replication is handled on the GFS layer.
I Trade-off with availability• If a tablet server fails, its portion of data is temporarily unavailable until a new server
is assigned.
75 / 90
![Page 98: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/98.jpg)
Tablet Serving (2/2)
I Strong consistency• Only one tablet server is responsible for a given piece of data.• Replication is handled on the GFS layer.
I Trade-off with availability• If a tablet server fails, its portion of data is temporarily unavailable until a new server
is assigned.
75 / 90
![Page 99: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/99.jpg)
Loading Tablets
I To load a tablet, a tablet server does the following:
I Finds locaton of tablet through its METADATA.• Metadata for a tablet includes list of SSTables and set of redo points.
I Read SSTables index blocks into memory.
I Read the commit log since the redo point and reconstructs the memtable.
76 / 90
![Page 100: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/100.jpg)
Compaction
I Minor compaction• Convert the memtable into an SSTable.
I Merging compaction• Reads the contents of a few SSTables and the memtable, and writes out a new
SSTable.
I Major compaction• Merging compaction that results in only one SSTable.• No deleted records, only sensitive live data.
77 / 90
![Page 101: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/101.jpg)
Compaction
I Minor compaction• Convert the memtable into an SSTable.
I Merging compaction• Reads the contents of a few SSTables and the memtable, and writes out a new
SSTable.
I Major compaction• Merging compaction that results in only one SSTable.• No deleted records, only sensitive live data.
77 / 90
![Page 102: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/102.jpg)
Compaction
I Minor compaction• Convert the memtable into an SSTable.
I Merging compaction• Reads the contents of a few SSTables and the memtable, and writes out a new
SSTable.
I Major compaction• Merging compaction that results in only one SSTable.• No deleted records, only sensitive live data.
77 / 90
![Page 103: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/103.jpg)
BigTable vs. HBase
BigTable HBase
GFS HDFS
Tablet Server Region Server
SSTable StoreFile
Memtable MemStore
Chubby ZooKeeper
78 / 90
![Page 104: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/104.jpg)
HBase Example
# Create the table "test", with the column family "cf"
create ’test’, ’cf’
# Use describe to get the description of the "test" table
describe ’test’
# Put data in the "test" table
put ’test’, ’row1’, ’cf:a’, ’value1’
put ’test’, ’row2’, ’cf:b’, ’value2’
put ’test’, ’row3’, ’cf:c’, ’value3’
# Scan the table for all data at once
scan ’test’
# To get a single row of data at a time, use the get command
get ’test’, ’row1’
79 / 90
![Page 105: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/105.jpg)
Cassandra
80 / 90
![Page 106: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/106.jpg)
Cassandra
81 / 90
![Page 107: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/107.jpg)
From Dynamo
I Symmetric P2P architecture
I Gossip based discovery and error detection
I Distributed key-value store: partitioning and topology discovery
I Eventual consistency
82 / 90
![Page 108: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/108.jpg)
From BigTable
I Sparse Column oriented sparse array
I SSTable disk storage• Append-only commit log• Memtable (buffering and sorting)• Immutable sstable files• Compaction
83 / 90
![Page 109: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/109.jpg)
Cassandra Example
# Create a keyspace called "test"
# (a keyspace is similar to a database in the RDBMS)
create keyspace test
with replication = {’class’: ’SimpleStrategy’, ’replication_factor’: 1};
# Print the list of keyspaces
describe keyspaces;
# Navigate to the "test" keyspace
use test
# Create the "words" table in the "test" keyspace
create table words (word text, count int, primary key (word));
# Insert a row
insert into words(word, count) values(’hello’, 5);
# Look at the table
select * from words;
84 / 90
![Page 110: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/110.jpg)
Summary
85 / 90
![Page 111: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/111.jpg)
Summary
I NoSQL data models: key-value, column-oriented, document-oriented, graph-based
I Sharding and consistent hashing
I ACID vs. BASE
I CAP (Consistency vs. Availability)
86 / 90
![Page 112: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/112.jpg)
Summary
I Dynamo: key/value storage: put and get
I Data partitioning: consistent hashing
I Replication: several nodes, preference list
I Data versioning: vector clock, resolve conflict at read time by the application
I Membership management: join/leave by admin, gossip-based to update the nodes’views, ping to detect failure
87 / 90
![Page 113: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/113.jpg)
Summary
I BigTable
I Column-oriented
I Main components: master, tablet server, client library
I Basic components: GFS, SSTable, Chubby
88 / 90
![Page 114: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/114.jpg)
References
I G. DeCandia et al., Dynamo: amazon’s highly available key-value store, ACMSIGOPS operating systems review. Vol. 41. No. 6. ACM, 2007.
I F. Chang et al., Bigtable: A distributed storage system for structured data, ACMTransactions on Computer Systems (TOCS) 26.2, 2008.
I A. Lakshman et al., Cassandra: a decentralized structured storage system, ACMSIGOPS Operating Systems Review 44.2, 2010.
89 / 90
![Page 115: NoSQL Databases - GitHub Pages · Consistency vs. Availability I The large-scale applications have to bereliable:availability + partition tolerance I These properties aredi cultto](https://reader034.fdocuments.in/reader034/viewer/2022042115/5e92fb56814ef1219c2370cd/html5/thumbnails/115.jpg)
Questions?
90 / 90