Apache Cassandra, part 1 – principles, data model
April 9, 2023 www.ExigenServices.com
Apache Cassandra, part 1 – principles, data model
I. RDBMS Pros and Cons
Pros
1. Good balance between functionality and usability. Powerful tool support.
2. SQL has a feature-rich syntax.
3. A set of widely accepted standards.
4. Consistency.
Scalability
RDBMSs were mainstream for decades, until scalability requirements increased dramatically.
The complexity of the data structures being processed also increased dramatically.
Scaling
Two ways to achieve scalability:
– Vertical scaling
– Horizontal scaling
CAP Theorem
Cons
Cost of distributed transactions:
a) Reduced availability. A transaction that needs two DBs, each with 99.9% availability, has an availability of about 100% - 2 * (100% - DB availability) = 99.8%. That is roughly 86 min. of downtime per 30-day month, versus about 43 min. for a single DB.
b) Additional synchronization overhead.
c) A transaction is as slow as the slowest DB node plus network latency.
d) 2PC is a blocking protocol.
e) It is possible to lock resources forever.
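The availability arithmetic in item a) can be checked with a short sketch. It assumes the databases fail independently and that a month has 30 days; the function names are illustrative only. For one DB at 99.9% it gives about 43 minutes of downtime per month, and for two DBs about 86 minutes.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def combined_availability(availabilities):
    """Availability of an operation that needs every listed database to be up.

    Assumes the databases fail independently of each other.
    """
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


def downtime_minutes(availability, period_minutes=MINUTES_PER_MONTH):
    """Expected downtime over the period for a given availability (0..1)."""
    return (1.0 - availability) * period_minutes


single = downtime_minutes(0.999)
both = downtime_minutes(combined_availability([0.999, 0.999]))
print(f"one DB:  {single:.1f} min/month")
print(f"two DBs: {both:.1f} min/month")
```

The exact product 0.999 * 0.999 is slightly above the linear approximation 100% - 2 * 0.1%, but the difference is negligible at this scale.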
Cons
Use of master-slave replication.
It makes the write side (the master) a performance bottleneck and requires additional CPU/IO resources.
There is no partition tolerance.
Sharding
a) Feature sharding
b) Hash code sharding
c) Lookup table: the node that holds the lookup table is a performance bottleneck and a single point of failure.
Feature sharding
DB instances are divided by DB functions.
Hash code sharding
Data is distributed across DB instances by hash code ranges.
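A minimal sketch of hash code sharding, assuming a 32-bit hash space split into equal contiguous ranges. The shard names, the use of MD5, and the helper functions are hypothetical and only illustrate the idea.

```python
import hashlib
from bisect import bisect_right

HASH_SPACE = 2 ** 32  # assumption: hash codes fall in [0, 2^32)


def hash_code(key):
    """Map a string key to an integer in [0, HASH_SPACE)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def make_ranges(shards):
    """Return the exclusive upper bound of each shard's hash range."""
    step = HASH_SPACE // len(shards)
    bounds = [step * (i + 1) for i in range(len(shards))]
    bounds[-1] = HASH_SPACE  # the last range absorbs the rounding remainder
    return bounds


def shard_for_hash(code, shards, bounds):
    """Pick the shard whose range contains the given hash code."""
    return shards[bisect_right(bounds, code)]


def shard_for(key, shards, bounds):
    return shard_for_hash(hash_code(key), shards, bounds)


shards = ["db0", "db1", "db2"]  # hypothetical DB instances
bounds = make_ranges(shards)
print(shard_for("user:42", shards, bounds))
```

Because the shard depends only on the hash of the key, load spreads evenly, but related domain objects usually land on different instances.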
Sharding consistency
For efficient sharding, data should be eventually consistent.
Feature vs. hash code sharding
Feature sharding makes it possible to tune consistency at the granularity of the domain logic, but the load may not be well balanced.
Hash code sharding gives good load balancing but does not allow consistency at the domain logic level.
Cassandra sharding
Cassandra uses hash code load balancing.
Cassandra is better suited to reporting than to business logic processing.
Cassandra + Hadoop == OLAP server with high performance and availability.
II. Apache Cassandra. Overview
Cassandra
Amazon Dynamo (architecture):
– DHT
– Eventual consistency
– Tunable trade-offs, consistency
+
Google BigTable (data model):
– Values are structured and indexed
– Column families and columns
Distributed and decentralized
– No master/slave nodes (server symmetry)
– No single point of failure
DHT
A distributed hash table (DHT) is a class of decentralized distributed systems that provide a lookup service similar to a hash table: (key, value) pairs are stored in the DHT, and any participating node can efficiently retrieve the value associated with a given key.
DHT
– Keyspace
– Keyspace partitioning
– Overlay network
Keyspace
An abstract keyspace, such as the set of 128-bit or 160-bit strings.
A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes.
Keyspace partitioning
Keyspace distance function δ(k_1, k_2).
A node with ID i_x owns all the keys k_m for which i_x is the closest ID, measured according to δ(k_m, i_x).
Keyspace partitioning
Imagine mapping the range from 0 to 2^128 onto a circle so the values wrap around.
Keyspace partitioning
Consider what happens if node C is removed
Keyspace partitioning
Consider what happens if node D is added
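The effect of removing or adding a node can be sketched on a small ring. The figures from the slides are not part of this transcript, so the node positions and key positions below are hypothetical. The sketch also assumes that the distance function is the clockwise distance from a key to a node token, so a key belongs to the first node at or after it on the circle.

```python
from bisect import bisect_left

RING_SIZE = 2 ** 128  # keys and node tokens live in [0, RING_SIZE)


def owner(tokens, key_pos):
    """Return the node whose token is the first one at or after key_pos,
    walking clockwise and wrapping around past the largest token."""
    ordered = sorted((pos, name) for name, pos in tokens.items())
    positions = [pos for pos, _ in ordered]
    idx = bisect_left(positions, key_pos % RING_SIZE)
    return ordered[idx % len(ordered)][1]


def placement(tokens, keys):
    """Map every key name to the node that owns it."""
    return {name: owner(tokens, pos) for name, pos in keys.items()}


Q = RING_SIZE // 4  # a quarter of the ring
keys = {"k1": Q // 2, "k2": Q + 1, "k3": 3 * Q}  # hypothetical key positions

abc = {"A": 0, "B": Q, "C": 2 * Q}
without_c = {"A": 0, "B": Q}
with_d = {"A": 0, "B": Q, "C": 2 * Q, "D": Q + Q // 2}

print(placement(abc, keys))
print(placement(without_c, keys))
print(placement(with_d, keys))
```

In both cases only the keys in the affected range change owner (k2 here); k1 and k3 stay where they were, which is the point of this partitioning scheme.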
Overlay network
For any key k, each node either owns k or has a link to a node whose ID is closer to k.
Greedy algorithm (not necessarily globally optimal): at each step, forward the message to the neighbor whose ID is closest to k.
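A minimal sketch of greedy routing, assuming a tiny 64-slot keyspace, a symmetric ring distance, and a hand-written link table. All of these are hypothetical and chosen only to keep the example small.

```python
SIZE = 64  # hypothetical tiny keyspace: IDs and keys are in [0, 64)


def distance(a, b):
    """Shortest distance between two positions on the ring."""
    d = abs(a - b) % SIZE
    return min(d, SIZE - d)


def route(start, key, links):
    """Forward greedily until no neighbor is closer to key than the current node.

    Returns the list of visited node IDs; the last one is treated as the owner.
    """
    path = [start]
    current = start
    while True:
        best = min(links[current], key=lambda node: distance(node, key))
        if distance(best, key) >= distance(current, key):
            return path
        current = best
        path.append(current)


# Each node links to its two ring neighbors; nodes 0 and 32 also have a shortcut.
links = {
    0: [8, 56, 32], 8: [16, 0], 16: [24, 8], 24: [32, 16],
    32: [40, 24, 0], 40: [48, 32], 48: [56, 40], 56: [0, 48],
}
print(route(0, 37, links))
```

The loop terminates because the distance to the key strictly decreases on every hop.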
Elastic scalability
Adding or removing a node doesn’t require reconfiguring Cassandra, changing application queries, or restarting the system.
High availability and fault tolerance
– Cassandra picks A and P from CAP
– Eventual consistency
Tunable consistency
Replication factor (number of copies of each piece of data)
Consistency level (number of replicas to access on every read/write operation)
| Consistency level | Replicas accessed (read / write) |
| --- | --- |
| ONE | 1 |
| QUORUM | N/2 + 1 |
| ALL | N |
Quorum consistency level
R = N/2 + 1
W = N/2 + 1
R + W > N
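When R + W > N, every read set shares at least one replica with every write set, so a read sees the latest acknowledged write. A short sketch of that check, assuming N is the replication factor and N/2 means integer division:

```python
def quorum(n):
    """Number of replicas in a quorum for replication factor n."""
    return n // 2 + 1


def read_sees_latest_write(r, w, n):
    """True when any r replicas must overlap any w replicas out of n."""
    return r + w > n


n = 3
print(quorum(n))
print(read_sees_latest_write(quorum(n), quorum(n), n))  # QUORUM / QUORUM
print(read_sees_latest_write(1, 1, n))                  # ONE / ONE
print(read_sees_latest_write(1, n, n))                  # ONE read, ALL write
```

With N = 3, QUORUM reads and writes touch 2 replicas each, so they always overlap; ONE/ONE does not give that guarantee.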
Hybrid orientation
Column orientation:
– columns aren’t fixed
– columns can be sorted
– columns can be queried for a certain range
Row orientation:
– each row is uniquely identifiable by key
– rows group columns and super columns
Schema-free
You don’t have to define columns when you create the data model.
You think of the queries you will use and then organize the data around them.
III. Data Model
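Relational data model
Database
Table1:
| | Column1 | Column2 |
| --- | --- | --- |
| Row1 | value | value |
| Row2 | null | value |
| … | | |
Table2:
| | Column1 | Column2 | Column3 |
| --- | --- | --- | --- |
| Row1 | value | value | value |
| Row2 | null | value | null |
| … | | | |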
Cassandra data model
Keyspace
Column Family
RowKey1: Column1 = Value1, Column2 = Value2, Column3 = Value3
RowKey2: Column1 = Value1, Column4 = Value4
Keyspace
A keyspace is close to a database in the relational model.
Basic attributes:
– replication factor
– replica placement strategy
– column families (tables in the relational model)
It is possible to create several keyspaces per application (for example, if you need a different replica placement strategy or replication factor).
Column family
– A container for a collection of rows
– A column family is close to a table in the relational data model
Column Family
Row: RowKey, with Column1 = Value1, Column2 = Value2, Column3 = Value3
Column family vs. Table
The store is effectively a four-dimensional hash map: [Keyspace][ColumnFamily][Key][Column]
Columns are not strictly defined in a column family; you can freely add any column to any row at any time.
A column family can hold columns or super columns (collection of subcolumns)
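The four-dimensional hash map can be pictured with nested dictionaries. This is only an analogy in plain Python, not how Cassandra stores data, and the keyspace, column family, and column names are hypothetical.

```python
def insert(store, keyspace, column_family, row_key, column, value):
    """Set store[keyspace][column_family][row_key][column] = value,
    creating the intermediate levels on first use."""
    row = (store.setdefault(keyspace, {})
                .setdefault(column_family, {})
                .setdefault(row_key, {}))
    row[column] = value


store = {}
insert(store, "Shop", "Users", "user1", "name", "Alice")
insert(store, "Shop", "Users", "user1", "email", "alice@example.com")
insert(store, "Shop", "Users", "user2", "name", "Bob")
insert(store, "Shop", "Users", "user2", "phone", "555-0100")

print(store["Shop"]["Users"]["user2"]["phone"])
```

The two rows hold different sets of columns, which is what "columns are not strictly defined" means in practice.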
Column family vs. Table
A column family has a comparator attribute that indicates how columns are sorted in query results (as long, bytes, UTF8, etc.).
Each column family is stored in its own files on disk, so it’s useful to keep related columns in the same column family.
Column
Basic unit of data structure
Column
name: byte[]
value: byte[]
clock: long
Skinny and wide rows
Wide rows – a huge number of columns and few rows (used to store lists of things)
Skinny rows – a small number of columns and many different rows (close to the relational model)
Disadvantages of wide rows
They work poorly with the row cache.
If you have many rows and many columns, you end up with larger indexes (~40 GB of data and a 10 GB index).
Column sorting
Column sorting is typically important only with the wide-row model.
The comparator is an attribute of a column family that specifies how column names are compared for sort order.
Comparator types
Cassandra has the following predefined types:
– AsciiType
– BytesType
– LexicalUUIDType
– IntegerType
– LongType
– TimeUUIDType
– UTF8Type
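Why the choice of comparator matters can be shown with a plain Python analogy. It does not use Cassandra's comparator classes; it only contrasts text ordering (as with UTF8Type) with numeric ordering (as with LongType) for the same hypothetical column names.

```python
column_names = ["9", "10", "100"]  # hypothetical column names

# Text ordering: names are compared character by character.
utf8_order = sorted(column_names)

# Numeric ordering: names are interpreted as integers before comparing.
long_order = sorted(column_names, key=int)

print(utf8_order)
print(long_order)
```

With text ordering "10" and "100" sort before "9", so a range query over numeric column names returns unexpected slices unless a numeric comparator is chosen.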
Super column
Super column
name: byte[]
cols: Map<byte[], Column>
• Stores a map of subcolumns
• Cannot store a map of super columns (only one level deep)
• Five-dimensional hash: [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]
Super column
Sometimes it is useful to use composite keys instead of super columns:
• when more than one level of nesting is needed
• when super columns cause performance issues
Super column family
Column families:
– Standard (default)
  • Can combine columns and super columns
– Super
  • Stricter schema constraints
  • Can store only super columns
  • A subcomparator can be specified for subcolumns
Note that
There are no joins in Cassandra, so you can:
– join data on the client side
– create a denormalized second column family
IV. Advanced column types
TTL column type
A TTL column is a column whose value expires after a given period of time.
Useful for storing session tokens.
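The expiry behavior can be sketched with a small class. This is an illustration only, not Cassandra's implementation; the class name, the explicit `now` argument, and the token value are hypothetical.

```python
class TtlColumn:
    """A column whose value stops being readable ttl_seconds after the write."""

    def __init__(self, name, value, ttl_seconds, written_at):
        self.name = name
        self.value = value
        self.expires_at = written_at + ttl_seconds

    def read(self, now):
        """Return the value, or None once the TTL has elapsed."""
        return self.value if now < self.expires_at else None


# A session token written at t=1000 with a 30-minute (1800 s) TTL.
token = TtlColumn("session_token", "abc123", ttl_seconds=1800, written_at=1000)
print(token.read(now=1500))  # still within the TTL
print(token.read(now=2800))  # the TTL elapsed at t=2800
```

Passing the clock in explicitly keeps the sketch deterministic; a real client would simply set the TTL when writing the column.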
Counter column
In an eventually consistent environment, old versions of column values are overwritten by new ones, but counters should be cumulative.
Counter columns are intended to support increment/decrement operations in an eventually consistent environment without losing any of them.
CounterColumn internals
CounterColumn structure:
name
…….
[
(replicaId1, counter1, logical clock1),
(replicaId2, counter2, logical clock2),
………………..
(replicaId3, counter3, logical clock3)
]
CounterColumn write - before
UPDATE CounterCF SET count_me = count_me + 2
WHERE key = 'counter1'
[
(A, 10, 2),
(B, 3, 4),
(C, 6, 7)
]
CounterColumn write - after
A is the leader
[
(A, 10 + 2, 2 + 1),
(B, 3, 4),
(C, 6, 7)
]
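The before/after state of this write can be reproduced with a short sketch. It assumes each entry is a (replicaId, counter, logical clock) tuple and that the leader adds the delta to its own entry and advances its own clock by one; the function name is hypothetical.

```python
def apply_increment(shards, leader, delta):
    """Return a new shard list in which only the leader's entry changes."""
    return [
        (replica, count + delta, clock + 1) if replica == leader
        else (replica, count, clock)
        for replica, count, clock in shards
    ]


before = [("A", 10, 2), ("B", 3, 4), ("C", 6, 7)]
after = apply_increment(before, leader="A", delta=2)
print(after)
```

The entries of B and C are left untouched, matching the slide.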
CounterColumn Read
All Memtables and SSTables are read using the following algorithm:
All tuples with the local replicaId are summed; for each foreign replica, the tuple with the maximum logical clock value is chosen.
Counters of foreign replicas are updated during read repair, during the replicate-on-write procedure, or by AES (the anti-entropy service).
CounterColumn read - example
A is the local replica.
Memtable – (A, 12, 4) (B, 3, 5) (C, 10, 3)
SSTable1 – (A, 5, 3) (B, 1, 6) (C, 5, 4)
SSTable2 – (A, 2, 2) (B, 2, 4) (C, 6, 2)
Result:
(A, 19, 9) + (B, 1, 6) + (C, 5, 4) = 19 + 1 + 5 = 25
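The read algorithm can be sketched as follows, assuming A is the local replica and each entry is a (replicaId, counter, logical clock) tuple. The function name is hypothetical, and the sketch covers only the merge rule described in this transcript.

```python
def read_counter(local_id, sources):
    """Merge counter tuples from the Memtable and all SSTables.

    Local tuples are summed; for every foreign replica only the tuple
    with the highest logical clock is kept.
    """
    local_total = 0
    newest = {}  # foreign replicaId -> (clock, count)
    for tuples in sources:
        for replica, count, clock in tuples:
            if replica == local_id:
                local_total += count
            elif replica not in newest or clock > newest[replica][0]:
                newest[replica] = (clock, count)
    return local_total + sum(count for _, count in newest.values())


memtable = [("A", 12, 4), ("B", 3, 5), ("C", 10, 3)]
sstable1 = [("A", 5, 3), ("B", 1, 6), ("C", 5, 4)]
sstable2 = [("A", 2, 2), ("B", 2, 4), ("C", 6, 2)]

print(read_counter("A", [memtable, sstable1, sstable2]))  # 19 + 1 + 5 = 25
```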
Resources
– Home of the Apache Cassandra project: http://cassandra.apache.org/
– Apache Cassandra Wiki: http://wiki.apache.org/cassandra/
– Documentation provided by DataStax: http://www.datastax.com/docs/0.8/
– Good explanation of creating secondary indexes: http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html
– Eben Hewitt, “Cassandra: The Definitive Guide”, O’Reilly, 2010, ISBN: 978-1-449-39041-9
Authors
Lev Sivashov - [email protected]
Andrey Lomakin - [email protected], twitter: @Andrey_Lomakin, LinkedIn: http://www.linkedin.com/in/andreylomakin
Artem Orobets - [email protected], twitter: @Dr_EniSh
Anton Veretennik - [email protected]