Who wants to be a Cassandra Millionaire

Cassandra Installation to Optimization: 40 minutes of best practice and resources (@VictorFAnjos)

Transcript of Who wants to be a Cassandra Millionaire

Page 1: Who wants to be a Cassandra Millionaire

Cassandra Installation to Optimization: 40 minutes of best practice and resources

@VictorFAnjos

Page 2: Who wants to be a Cassandra Millionaire

© 2015. All Rights Reserved. @VictorFAnjos

Page 3: Who wants to be a Cassandra Millionaire

© Mark E. Damon - All Rights Reserved

Prize ladder (questions 15 down to 1): 1 Million, 500,000, 250,000, 125,000, 64,000, 32,000, 16,000, 8,000, 4,000, 2,000, 1,000, 500, 300, 200, 100

Welcome to Who Wants to Perform 1M ops/s

50:50

Page 4: Who wants to be a Cassandra Millionaire


Page 5: Who wants to be a Cassandra Millionaire

This storage medium allows for best performance.

A: NAS / SAN

B: SSD

C: DAS SATA

D: DAS SCSI

Page 6: Who wants to be a Cassandra Millionaire


Page 7: Who wants to be a Cassandra Millionaire

Installation and considerations

how to store the datastore

Storage Area Network

Solid State Drive

Page 8: Who wants to be a Cassandra Millionaire

Installation and considerations

how to store the datastore

Local (DAS), iSCSI, Fiber Channel


● AVOID network storage like the plague

● Direct Attached Storage FTW

● Disk latency is a HUGE deal for performance

Page 9: Who wants to be a Cassandra Millionaire

Installation and considerations

how to store the datastore


SATA/SAS DAS

PCIe/NVMe DAS

Page 10: Who wants to be a Cassandra Millionaire

Installation and considerations

how to store the datastore


Page 11: Who wants to be a Cassandra Millionaire


Page 12: Who wants to be a Cassandra Millionaire

When using SSDs, this filesystem type is best.

A: ZFS

B: Btrfs

C: Ext4

D: F2FS

Page 13: Who wants to be a Cassandra Millionaire


Page 14: Who wants to be a Cassandra Millionaire

Congratulations!

You’ve reached the 1,000 ops/s milestone!

Page 15: Who wants to be a Cassandra Millionaire

Installation and considerations

i can’t believe it’s not btrfs


● easiest to use ext4 (it’s on most Linux distros), but F2FS gets 5-10% gains in write performance

● if NOT using F2FS, make sure to TRIM

● multiple disks → use RAID0

Page 16: Who wants to be a Cassandra Millionaire


Page 17: Who wants to be a Cassandra Millionaire

This is the sweet spot for SWAP when using C*.

A: 0

B: ½ of HEAP

C: Equal to HEAP

D: Equal to RAM

Page 18: Who wants to be a Cassandra Millionaire


Page 19: Who wants to be a Cassandra Millionaire

Installation and considerations

to swap or not to swap


Page 20: Who wants to be a Cassandra Millionaire


Page 21: Who wants to be a Cassandra Millionaire

Having 64G of RAM means you should optimize to have ___G of HEAP.

A: 64G

B: 32G

C: 16G

D: 8G

Page 22: Who wants to be a Cassandra Millionaire


Page 23: Who wants to be a Cassandra Millionaire

Installation and considerations

how much heap?


http://docs.datastax.com/en/cassandra/1.2/cassandra/operations/ops_tune_jvm_c.html
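
As a rough sketch of the sizing rule this slide points at: the default heap calculation in `cassandra-env.sh`, as described in the DataStax tuning docs linked above, is max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). The function name `default_max_heap_mb` is hypothetical and only illustrates the arithmetic; check the rule against your own Cassandra version before relying on it.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximate the default MAX_HEAP_SIZE rule, in megabytes.

    Assumed rule: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)).
    """
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    return max(half, quarter)


if __name__ == "__main__":
    for ram_gb in (1, 4, 16, 64):
        heap = default_max_heap_mb(ram_gb * 1024)
        print(f"{ram_gb}G RAM -> {heap} MB heap")
```

Under this rule the heap stops growing at 8 GB no matter how much RAM the node has; the remainder is left to the OS page cache.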

Page 24: Who wants to be a Cassandra Millionaire


Page 25: Who wants to be a Cassandra Millionaire

Definitely DO NOT use this snitch in multi-DC environments.

A: EC2Snitch

B: Dynamic Snitch

C: Simple Snitch

D: Property File Snitch

Page 26: Who wants to be a Cassandra Millionaire


Page 27: Who wants to be a Cassandra Millionaire

Installation and considerations

son of a snitch


Page 28: Who wants to be a Cassandra Millionaire

Installation and considerations

son of a snitch


Page 29: Who wants to be a Cassandra Millionaire


Page 30: Who wants to be a Cassandra Millionaire

To reduce latency and wire time to my app, I should opt for:

A: Synchronous AND Full Queries

B: Asynchronous AND Prepared Statements

C: Synchronous AND Prepared Statements

D: Asynchronous AND Full Queries

Page 31: Who wants to be a Cassandra Millionaire


Page 32: Who wants to be a Cassandra Millionaire

Achieving performance through code/drivers

should I stay or should I go

MULTI DC

● Client writes to any Cassandra node

● Coordinator node replicates to other nodes (in local and remote Data Center)

● Local write acks returned to coordinator

● Client gets ack when enough total nodes are committed

● Data written to internal commit log disks

● When data arrives, remote node replicates data

● Ack direct to source region coordinator

● Remote copies written to commit log disks

If a node or region goes offline, hinted handoff completes the write when the node comes back up (as long as there are enough nodes to satisfy consistency level).

Page 33: Who wants to be a Cassandra Millionaire

Achieving performance through code/drivers

should I stay or should I go


Prepare ONCE...

Bind and Execute multiple times.
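
A minimal sketch of "prepare once, bind and execute many times", combined with asynchronous execution. It assumes `session` is a connected `cassandra.cluster.Session` from the DataStax Python driver and that a `users (user, email, state)` table exists; the function name `insert_users_async` is hypothetical.

```python
def insert_users_async(session, rows):
    """Prepare one INSERT, then bind and execute it asynchronously per row.

    `session` is assumed to be a cassandra.cluster.Session (DataStax Python
    driver); the `users` table and its columns are illustrative.
    """
    # Prepare ONCE: the server parses the statement a single time, so later
    # requests only ship the statement ID plus the bound values.
    insert = session.prepare(
        "INSERT INTO users (user, email, state) VALUES (?, ?, ?)"
    )

    # Bind and execute many times without waiting on each round trip.
    futures = [session.execute_async(insert, row) for row in rows]

    # Block only at the end; result() raises if a write failed.
    for future in futures:
        future.result()
    return len(futures)
```

For very large inputs you would probably want to cap the number of in-flight requests rather than queue them all at once; that is left out here to keep the sketch short.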

Page 34: Who wants to be a Cassandra Millionaire

Achieving performance through code/drivers

should I stay or should I go


Page 35: Who wants to be a Cassandra Millionaire


Page 36: Who wants to be a Cassandra Millionaire

With RF=2 and CL=Quorum, operations failed when 1 node went down because of this.

A: 1 / 1 = 1

B: 2 / 1 = 2

C: 2 * 1 = 2

D: 2 / 2 + 1 = 2

Page 37: Who wants to be a Cassandra Millionaire


Page 38: Who wants to be a Cassandra Millionaire

Congratulations!

You’ve reached the 32,000 ops/s milestone!

Page 39: Who wants to be a Cassandra Millionaire

Achieving performance through code/drivers

when friends aren’t enough


Replication Factor = 3

Insert into a cluster of size 6 with consistency Quorum

Two nodes in token range must be present for write to succeed
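
The quorum arithmetic behind this slide can be sketched as follows. QUORUM needs floor(RF / 2) + 1 replicas, which is why RF=3 tolerates one replica being down while RF=2 tolerates none. The helper names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3, 5):
        print(f"RF={rf}: quorum={quorum(rf)}, can lose {tolerated_failures(rf)}")
```

Note that the cluster size (6 here) does not enter the calculation: only the replicas that own the partition's token range count toward the quorum.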

Page 40: Who wants to be a Cassandra Millionaire

Achieving performance through code/drivers

when friends aren’t enough


What happens now?

Cannot achieve consistency level QUORUM


Nodes in partition key DOWN

Page 41: Who wants to be a Cassandra Millionaire


Page 42: Who wants to be a Cassandra Millionaire

This mathematical and CS concept helps when data modeling for query optimization.

A: Truth table

B: Brewer’s Theorem

C: CAP Theorem

D: Entropy

Page 43: Who wants to be a Cassandra Millionaire


Page 44: Who wants to be a Cassandra Millionaire

Data modelling, CQLSH and more

the truth shall set you free


Motivated by CS, Math, Engineering

Allows for creating building blocks that yield a single output

More complex truth tables can arise

Page 45: Who wants to be a Cassandra Millionaire

Data modelling, CQLSH and more

the truth shall set you free


How about searching for username?

And what about full_name?

Table: user_stream (1 marks the column used as the partition key)

user_id   username   full_name
1         0          0
0         1          0
0         0          1
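
One way to read the truth table above, sketched in code: each row marks the column a query filters on, and each such query gets its own table keyed by that column. This is an interpretation rather than something the slide spells out, and the `user_stream_by_<column>` naming is hypothetical.

```python
COLUMNS = ("user_id", "username", "full_name")

# Each row marks (with a 1) the column a query filters on.
TRUTH_TABLE = [
    (1, 0, 0),
    (0, 1, 0),
    (0, 0, 1),
]


def tables_for_queries(columns, truth_table, base="user_stream"):
    """Map each truth-table row to a (hypothetical) query table name."""
    tables = []
    for row in truth_table:
        keys = [col for col, bit in zip(columns, row) if bit]
        tables.append(f"{base}_by_{'_and_'.join(keys)}")
    return tables


if __name__ == "__main__":
    for name in tables_for_queries(COLUMNS, TRUTH_TABLE):
        print(name)
```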

Page 46: Who wants to be a Cassandra Millionaire


Page 47: Who wants to be a Cassandra Millionaire

A shift in paradigms: what should you maximize and reduce for good performance?

A: Reads / Batches

B: Writes / Batches

C: Writes / Deletes

D: Reads / Deletes

Page 48: Who wants to be a Cassandra Millionaire


Page 49: Who wants to be a Cassandra Millionaire

Data modelling, CQLSH and more

do the write thing


Page 50: Who wants to be a Cassandra Millionaire

Data modelling, CQLSH and more

do the write thing


memtable --- < 100ns

commit log --- ~ 1 ms

DELETES / TTL cause compactions

Page 51: Who wants to be a Cassandra Millionaire

Data modelling, CQLSH and more

do the write thing


Page 52: Who wants to be a Cassandra Millionaire


Page 53: Who wants to be a Cassandra Millionaire

To avoid hitting the 2B record limit (per row), this term borrowed from RDBMS can still make sense.

A: ACID

B: Vector

C: Rollback

D: Sharding

Page 54: Who wants to be a Cassandra Millionaire


Page 55: Who wants to be a Cassandra Millionaire

Data modelling, CQLSH and more

sit on this and rotate
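
The slide itself carries only its title, so the following is a sketch of one common way to shard a wide row: add a rotating time bucket to the partition key so that no single partition grows without bound. The day-sized bucket and the names `user_id` and `partition_key` are assumptions for illustration.

```python
from datetime import datetime, timezone


def partition_key(user_id: str, event_time: datetime) -> tuple:
    """Return a composite partition key (user_id, day bucket in UTC)."""
    bucket = event_time.astimezone(timezone.utc).strftime("%Y%m%d")
    return (user_id, bucket)


if __name__ == "__main__":
    t = datetime(2015, 6, 1, 23, 30, tzinfo=timezone.utc)
    print(partition_key("user-42", t))
```

Reads then need to know which buckets to visit, so the bucket size is a trade-off between partition width and the number of partitions touched per query.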


Page 56: Who wants to be a Cassandra Millionaire


Page 57: Who wants to be a Cassandra Millionaire

Many say to use sparingly; I would say avoid like the plague.

A: Batches

B: Synchronous

C: Secondary Indexes

D: MySQL

Page 58: Who wants to be a Cassandra Millionaire


Page 59: Who wants to be a Cassandra Millionaire

Performance must-haves

never be second best


writes are distributed among the cluster

each partition key refers to one exact position in which to get a row

but what do we do when we don’t have exactly the right type of index to specify a query?

CREATE TABLE users (
  user varchar,
  email varchar,
  state varchar,
  PRIMARY KEY (user)
);

-- OPTION 1: create an index
CREATE INDEX idxUBS ON users (state);

-- OPTION 2: create another table (store data twice)
CREATE TABLE usersByState (
  state varchar,
  user varchar,
  PRIMARY KEY (state, user)
);
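
A sketch of what OPTION 2 looks like from the application side: every user is written to both tables, and lookups by state read the second table instead of a secondary index. It assumes `session` is a `cassandra.cluster.Session` from the DataStax Python driver; the function names are hypothetical.

```python
def add_user(session, user, email, state):
    """Write one user to both tables so each lookup hits a single partition.

    `session` is assumed to be a cassandra.cluster.Session (DataStax Python
    driver), which uses %s placeholders for non-prepared statements.
    """
    session.execute(
        "INSERT INTO users (user, email, state) VALUES (%s, %s, %s)",
        (user, email, state),
    )
    session.execute(
        "INSERT INTO usersByState (state, user) VALUES (%s, %s)",
        (state, user),
    )


def users_in_state(session, state):
    """Read by state from the query table instead of a secondary index."""
    rows = session.execute(
        "SELECT user FROM usersByState WHERE state = %s", (state,)
    )
    return [row.user for row in rows]
```

The two inserts are not atomic in this sketch, so the application has to tolerate (or repair) the case where only one of them lands.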

Page 60: Who wants to be a Cassandra Millionaire


Page 61: Who wants to be a Cassandra Millionaire

This recent addition to C* now helps with ACID-like transactions, at a bit of a performance hit.

A: UDT

B: Lightweight Transactions

C: JSON

D: Hinted handoff

Page 62: Who wants to be a Cassandra Millionaire


Page 63: Who wants to be a Cassandra Millionaire

Performance must-haves

slimfast agreement


Proposer:

● Prepares a proposal that is sent to a number of Acceptors.

● Waits on an acknowledgement (in the form of a promise) from Acceptors.

● Sends accept message to a Quorum of Acceptors with the new value to commit.

● Returns success/completion to client.

Acceptor:

● Determines if the proposal is newer than what it has seen.

● Acknowledges/agrees with its own highest proposal value seen AND the current value (of what is to be set).

● Receives message to commit new value.

● Accepts and returns on successful commit of value.
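
The proposer and acceptor steps above can be sketched as a toy, in-memory round of classic single-decree Paxos. This is not Cassandra's internal implementation of lightweight transactions; the class and function names are hypothetical, and every acceptor is assumed to be reachable.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0          # highest proposal number promised so far
        self.accepted_n = 0        # number of the accepted proposal, if any
        self.accepted_value = None

    def prepare(self, n):
        """Promise to ignore proposals older than n, if n is the newest seen."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, self.accepted_n, self.accepted_value

    def accept(self, n, value):
        """Commit the value unless a newer proposal has been promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run one proposal round; return the chosen value, or None on failure."""
    quorum = len(acceptors) // 2 + 1

    # Phase 1: send prepare and collect promises.
    replies = [a.prepare(n) for a in acceptors]
    promises = [(an, av) for ok, an, av in replies if ok]
    if len(promises) < quorum:
        return None

    # If any acceptor already accepted a value, adopt the newest one.
    prior_n, prior_value = max(promises, key=lambda p: p[0])
    if prior_n > 0:
        value = prior_value

    # Phase 2: ask the acceptors to accept the value.
    accepted = sum(a.accept(n, value) for a in acceptors)
    return value if accepted >= quorum else None
```

The extra round trips in each proposal are where the "bit of a performance hit" comes from compared with a plain write.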

Page 64: Who wants to be a Cassandra Millionaire


Performance must-haves

slimfast agreement

Page 65: Who wants to be a Cassandra Millionaire

Thank you