Data day texas: Cassandra and the Cloud

75
© DataStax, All Rights Reserved. Cassandra and the Cloud (2018 edition) Jonathan Ellis

Transcript of Data day texas: Cassandra and the Cloud

Page 1: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Cassandra and the Cloud (2018 edition)

Jonathan Ellis

Page 2: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

“Databases”

Page 3: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Stuff everyone agrees on

Page 4: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Stuff (Almost) Everyone Agrees On

1. Eventual Consistency is useful

Page 5: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Eventual Consistency (CP edition)

Page 6: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Eventual Consistency (AP edition)

Page 7: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Default consistency levels

1. Cassandra: Eventual2. Dynamo: Eventual3. CosmosDB: Eventual (“Session”)4. Spanner: ACID (no EC)

Page 8: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Stuff (Almost) Everyone Agrees On

1. Eventual Consistency is useful2. Automatic partitioning doesn’t work

Page 9: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

H-Store (2012)

Page 10: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Partitioning approaches

1. Cassandra: Explicit2. Dynamo: Explicit3. CosmosDB: Explicit4. Spanner: Explicit

Page 11: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Stuff (Almost) Everyone Agrees On

1. Eventual Consistency is useful2. Automatic partitioning doesn’t work3. SQL is a pretty okay query language

Page 12: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Page 13: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Query APIs

1. Cassandra: CQL, inspired by SQL2. DynamoDB: Actually still pretty first-gen NoSQL3. CosmosDB: “SQL”4. Spanner: “SQL”

Page 14: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Stuff (Almost) Everyone Agrees On

1. Eventual Consistency is useful2. Automatic partitioning doesn’t work3. SQL is a pretty okay query language

4. … that’s about it

Page 15: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Thomas Sowell

There are no solutions. Only tradeoffs.

Page 16: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Cassandra

Page 17: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Data Model: tabular, with nested contentCREATE TABLE notifications ( target_user text, notification_id timeuuid, source_id uuid, source_type text, activity text, PRIMARY KEY (target_user, notification_id))WITH CLUSTERING ORDER BY (notification_id DESC);

Page 18: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

target_user notification_id source_id source_type activity

nick e1bd2bcb- d972b679- photo tom liked

nick 321998c- d972b679- photo jake commented

nick ea1c5d35- 88a049d5- user mike created account

nick 5321998c- 64613f27- photo tom commented

nick 07581439- 076eab7e- user tyler created account

mike 1c34467a- f04e309f- user tom created account

Page 19: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

CollectionsCREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, birth_date int, email_addresses set<text>);

Page 20: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

User-defined TypesCREATE TYPE address ( street text, city text, zip_code int, phones set<text>)

CREATE TABLE users ( id uuid PRIMARY KEY, name text, addresses map<text, address>)

SELECT id, name, addresses.city, addresses.phones FROM users;

id | name | addresses.city | addresses.phones--------------------+----------------+-------------------------- 63bf691f | jbellis | Austin | {'512-4567', '512-9999'}

Page 21: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

JSONINSERT INTO users JSON'{"id": "0514e410-", "name": "jbellis", "addresses": {"home": {"street": "9920 Cassandra Ave", "city": "Austin", "zip_code": 78700, "phones": ["1238614789"]}}}';

Page 22: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Without nesting

Page 23: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

With nesting

Page 24: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Consistency Levels

Page 25: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Consistency Levels

Page 26: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Consistency Levels

Page 27: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Multi-region

1. Synchronous writes locally; async globally2. Serve reads and writes for any row in any region3. DR is “free”4. Client-level support

Page 28: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Notable features

● Lightweight Transactions (Paxos, expensive)● Materialized Views● User-defined functions● Strict schema

Page 29: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Page 30: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Page 31: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Schema confusion{"userid": "2452347", "name": "jbellis", ... }

{"userid": 2452348, "name": "jshook", ... }

{"user_id": 2452349, "name": "jlacefield", ... }

Page 32: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

DynamoDB

Page 33: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

CP single partition

● Original Dynamo was AP● DynamoDB offers Strong/Eventual read consistency● But

Conditional writes are the same price as regular writes

And: “All write requests are applied in the order in which they were received”

Page 34: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Data model

● “Map of maps”● Primary key, sort key

Page 35: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Data Model

Page 36: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Multi-region

● New feature (late 2017): Global tables● Shards and replicates a table across regions● Each shard can only be written to by its master region

Page 37: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Notable features

● Global indexes● Change feed● “DynamoDB Transaction Library”

“A put that does not contend with any other simultaneous puts can be expected to perform 7N + 4 writes as the original operation, where N is the number of requests in the transaction.”

Page 38: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Sidebar: cross-partition txns in AP?

Page 39: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

CosmosDB

Page 40: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Data model:CP Single Partition

Page 41: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

“Multi model”

● NOT just “APIs”● More like MyRocks/MongoRocks than C* CQL/JSON● What they have in common:● Hash partitioning● Undefined sorting within partitions; use ORDER BY

Page 42: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Data model

Page 43: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

“Multi model”

● Not all features supported everywhere● Azure Functions● Change Feed● TLDR use SQL/Document API

Page 44: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Page 45: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

SQL support and extensionsSELECT c.givenName FROM Families f JOIN c IN f.children WHERE f.id = 'WakefieldFamily'ORDER BY f.address.city ASC

“The language lets you refer to nodes of the tree at any arbitrary depth, like Node1.Node2.Node3…..NodeM”

Page 46: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Page 47: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Consistency Levels

Page 48: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Consistency Levels

● This is probably still too many(“About 73% of Azure Cosmos DB tenants use session consistency and 20% prefer bounded staleness.”)

Page 49: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Implementation clue?

Page 50: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Multi-region

● Claims local read/writes with async replication between regions

● But, also claims ACID single-partition transactions in stored procedures

● You can’t have both! Something doesn’t add up!

Page 51: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Notable features

● Everything is indexed99p < 20% overhead

● “Attachment” special document type for blobsMain purpose seems to be to allow PUTing data easily

● Stored proceduresIncluding (single-partition) transactionsOnly for SQL API

● Change feed

Page 52: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Spanner

Page 53: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Data model: CP multi-partition

● “Reuse existing SQL skills to query data in Cloud Spanner using familiar, industry-standard ANSI 2011 SQL.”(Actually a fairly small subset)(And only for SELECT)

Page 54: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Interleaved/child tablesCREATE TABLE Singers ( SingerId INT64 NOT NULL, FirstName STRING(1024), LastName STRING(1024), SingerInfo BYTES(MAX),) PRIMARY KEY (SingerId);

CREATE TABLE Albums ( SingerId INT64 NOT NULL, AlbumId INT64 NOT NULL, AlbumTitle STRING(MAX),) PRIMARY KEY (SingerId, AlbumId), INTERLEAVE IN PARENT Singers ON DELETE CASCADE;

Page 55: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Page 56: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Multi-region

Page 57: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Multi-region, TLDR

● You can replicate to multiple regions● Only one region can accept writes at a time● Opinion: it is not often useful to scale reads without also

scaling writes

Page 58: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Notable features

● Full multi-partition ACID 2PC (using Paxos replication groups)

● DFS-based, not local storage

Page 59: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

The price of ACID

Page 60: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Writes slow down (indexed) reads

Quizlet:

“Bulk writes severely impact the performance of queries using the secondary index [becuase] a write with a secondary index updates many splits, which [since Spanner uses pessimistic locking] creates contention for reads that use that secondary index.”

Page 61: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Practical considerations

Page 62: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Cassandra

● Run anywhere you like--but you have to run it○ But: DataStax Managed Cloud, Instaclustr○ Also: DataStax Remote DBA

● Storage closely tied to compute○ But: everyone struggles with this

Page 63: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Multi-cloud

● JP Morgan: “We have seen increasingly all the customers we talk to, almost exclusively large mid-market to large enterprise, all now are embracing multi-cloud as a specific strategy.”

Page 64: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

DynamoDB

● Request capacity tied to “partitions” [pp]○ pp count = max (rc / 3000, wc / 1000, st / 10 GB)

● Subtle implication: capacity / pp decreases as storage volume increases○ Non-uniform: pp request capacity halved when shard splits

● Subtle implication 2: bulk loads will wreck your planning

Page 65: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

“Best practices for tables”

● Bulk load 20M items = 20 GB● Target 30 minutes = 11,000 write capacity = 11 pps

● Post bulk load steady state 200 req/s = 18 req/pp● No way to reduce partition count

Page 66: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

DynamoDB provisioning in the wild

● You Probably Shouldn’t Use DynamoDB○ Hacker News Discussion

● The Million Dollar Engineering Challenge○ Hacker News discussion

Page 67: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

CosmosDB

● Like DynamoDB, but underdocumented● Partition and scale in Azure Cosmos DB

○ Unspecified: max pp storage size, max pp request capacity

Page 68: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Spanner

● DFS architecture means it doesn’t have the provisioning problem

● Priced per node (~$8500 min per 2 TB data, per year) + $3600 / TB / yr○ Doesn’t appear to be part of the GCP free usage tier

● Competing with Cloud Datastore, Cloud BigTable

Page 69: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Recommendations

Page 70: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Never recommended

● DynamoDB: Lack of nesting is too limiting● Spanner: All ACID, all the time is too expensive

Page 71: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

CosmosDB vs Cassandra

● Rough feature parity○ Partitioned rows (documents) with nesting○ Default EC with opt-in to stronger options

● Cassandra○ Materialized views○ True multi-model○ Predictable provisioning

● CosmosDB○ Stored procedures○ ORDER BY○ Change feed

Page 72: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

CosmosDB vs Cassandra

● Given rough feature parity, why would you pick the one that only runs in a single cloud?

● Hobbyist / less than one (three) C* VM?

Page 73: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

DataStax Enterprise

Page 74: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Distributed Systems Reading List

● Bigtable: A Distributed Storage System for Structured Data

● Dynamo: Amazon’s Highly Available Key-value Store ● Cassandra - A Decentralized Structured Storage

System [annotated by Jonathan Ellis]● Skew-aware automatic database partitioning in

shared-nothing, parallel OLTP systems● Calvin: Fast Distributed Transactions for Partitioned

Database Systems● Spanner: Google's Globally-Distributed Database

Page 75: Data day texas: Cassandra and the Cloud

© DataStax, All Rights Reserved.

Thank you