Cassandra Virtual Node talk

©2012 DataStax 1 V is for vnodes Patrick McFadin, Sr Solution Architect DataStax Friday, February 15, 13


Transcript of Cassandra Virtual Node talk

Page 1: Cassandra Virtual Node talk

V is for vnodes

Patrick McFadin, Sr Solution Architect, DataStax

Page 2: Cassandra Virtual Node talk

Agenda for today

• What is a node?
• How vnodes work
• Converting your cluster
• Benefits

Page 3: Cassandra Virtual Node talk

Since the beginning...

Cassandra has had...

Clusters, which have...

Keyspaces, which have...

Column Families, which have...

Page 4: Cassandra Virtual Node talk

Row Keys

Unique in a column family
Can be up to 64k in size

Can be sorted in the cluster: Byte Ordered Partitioner

OR...

Can be randomly placed in the cluster: Random Partitioner

Page 5: Cassandra Virtual Node talk

Row Keys

How do you...

• Create a random number?
• Make sure the number is big enough?
• Make it reproducible?

MD5 does the job

Input a Row Key -> MD5 -> Get a 128 bit number

Page 6: Cassandra Virtual Node talk

Row Keys

Input: 8675309 -> MD5 -> Get: 0x6cc0d36686e6a433aa76f96773852d35

Input: @PatrickMcFadin -> MD5 -> Get: 0xcfc2d0610aaa712a8c36711d08a2550a

The number produced falls in the range between:

0 and 2^128-1... but Cassandra uses 2^127-1

...otherwise known as a HUGE number.

2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456
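The row key hashing described on these slides can be tried out with a few lines of Python. This is a minimal sketch, not Cassandra's own code: the function name `md5_token`, the UTF-8 encoding of the key, and reading the digest as an unsigned big-endian integer are all assumptions made for illustration, and Cassandra's Random Partitioner may post-process the digest differently.

```python
import hashlib


def md5_token(row_key: str) -> int:
    """Hash a row key with MD5 and read the 16-byte digest as one unsigned integer.

    Assumption: the key is encoded as UTF-8 before hashing.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


# The same key always produces the same 128-bit number.
for key in ("8675309", "@PatrickMcFadin"):
    token = md5_token(key)
    print(f"{key!r} -> {hex(token)} ({token})")
```

The three questions from the slide map onto the hash directly: the output looks random, it is 128 bits wide, and it is reproducible because the same input always hashes to the same value.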

Page 8: Cassandra Virtual Node talk

Token Assignment

• Each Cassandra node is assigned a token
• Each token is a number inside the huge range
• Tokens mark the ownership range of Row Keys

[Figure: ring with ownership ranges marked From/To at Token = 0, Token = 56713727820156410577229101238628035242, and Token = 113427455640312821154458202477256070484]
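The three tokens on this slide are what you get by dividing a 2^127 token space evenly among three nodes. A small sketch of that arithmetic follows; the names `RING_SIZE` and `evenly_spaced_tokens` are made up for this example, and the formula is inferred from the slide's numbers rather than taken from any Cassandra tool.

```python
RING_SIZE = 2**127  # size of the token space quoted on the slides


def evenly_spaced_tokens(node_count: int) -> list:
    """Return one token per node, spaced evenly around the ring."""
    step = RING_SIZE // node_count
    return [step * i for i in range(node_count)]


# For a three-node ring this prints the three tokens shown on the slide.
for token in evenly_spaced_tokens(3):
    print(token)
```

Every time the node count changes, every one of these values changes too, which is why manual token assignment becomes painful as a cluster grows or shrinks.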

Page 9: Cassandra Virtual Node talk

Row Key to Token

Input: @PatrickMcFadin -> MD5 -> Get: 276161727147663567581939045564154008842

[Figure: ring with nodes at Token = 0, Token = 56713727820156410577229101238628035242, and Token = 113427455640312821154458202477256070484; the node that owns the key's token says "I’ll take it!"]
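Finding which node says "I’ll take it!" is a search over the sorted tokens. The sketch below assumes that a node with token T owns every key token greater than the previous node's token and up to and including T, wrapping around at the end of the ring; the function name `owning_token` is hypothetical.

```python
import bisect

# Tokens of the three nodes on the slide, in sorted order.
RING_TOKENS = [
    0,
    56713727820156410577229101238628035242,
    113427455640312821154458202477256070484,
]


def owning_token(key_token: int, ring_tokens: list) -> int:
    """Return the token of the node that owns key_token.

    Assumption: the owner is the first node whose token is >= key_token,
    and keys past the largest token wrap around to the first node.
    """
    index = bisect.bisect_left(ring_tokens, key_token)
    return ring_tokens[index % len(ring_tokens)]


key_token = 276161727147663567581939045564154008842  # value from the slide
print(owning_token(key_token, RING_TOKENS))
```

Under that assumption, the slide's key token is larger than every node token, so the lookup wraps around and the node at Token = 0 takes the row.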

Page 16: Cassandra Virtual Node talk

Cassandra 1.1 Node

• Responsible for a single range of keys
• Range determined by single token
• One server = One token = One node

[Images: "Commodity node?" next to "What you really want."]

Page 18: Cassandra Virtual Node talk

Time for a new plan

• Hardware is only getting bigger
• One node is responsible for more data
• Token assignments are a pain

Page 19: Cassandra Virtual Node talk

Token assignment (sucks)

• Tokens need to be evenly spread
• Growing a ring... not good options
• Shrinking a ring... not good options
• Tokens have to be added to each server config

Page 20: Cassandra Virtual Node talk

Enter Virtual Nodes

• One server should have many nodes
• Each node should be small
• Tokens should be automatic

[Figure: in Version 1.1, Server 1 holds a single range, 1-4; in Version 1.2, Server 1 holds four small nodes, 1, 2, 3, and 4]

Page 21: Cassandra Virtual Node talk

Virtual Node Features

• Default 256 Nodes per server
• Auto assign tokens
• Faster rebuilds of servers
• Faster server add to cluster
• New partitioner (More later)
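One way to see why many small nodes per server can speed up rebuilds is to count how many other servers sit next to a given server on the ring. The sketch below is a toy model only: it assumes tokens are picked at random, ignores replication factor entirely, and uses hypothetical names (`build_ring`, `ring_neighbours`).

```python
import random

RING_SIZE = 2**127


def build_ring(server_count: int, tokens_per_server: int, seed: int = 42) -> list:
    """Return a sorted list of (token, server) pairs with randomly chosen tokens."""
    rng = random.Random(seed)
    ring = []
    for server in range(server_count):
        for _ in range(tokens_per_server):
            ring.append((rng.randrange(RING_SIZE), server))
    ring.sort()
    return ring


def ring_neighbours(ring: list, server: int) -> set:
    """Servers that own a range directly before or after one of `server`'s ranges."""
    peers = set()
    for i, (_, owner) in enumerate(ring):
        if owner == server:
            peers.add(ring[i - 1][1])                # range just before
            peers.add(ring[(i + 1) % len(ring)][1])  # range just after
    peers.discard(server)
    return peers


one_token = build_ring(server_count=6, tokens_per_server=1)
vnodes = build_ring(server_count=6, tokens_per_server=256)
print("neighbours with 1 token per server:   ", len(ring_neighbours(one_token, 0)))
print("neighbours with 256 tokens per server:", len(ring_neighbours(vnodes, 0)))
```

With a single token, a server only ever borders two others, so a rebuild leans on those few machines. With 256 tokens its ranges are scattered around the ring, so in this model the work can be spread over every other server.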

Page 22: Cassandra Virtual Node talk

Transitioning to vnodes

Super easy!

Find these lines in your cassandra.yaml file:

    #num_tokens:
    initial_token: <some big number>

Change to:

    num_tokens: 256
    initial_token:

and restart. Repeat on all nodes in cluster.
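If there are many servers to convert, the two-line change can be scripted. The following is a rough sketch that works on the text of a cassandra.yaml file; the function name `enable_vnodes` and the sample file contents are invented for this example, and it only touches the two settings named on the slide. Restarting each node is still a separate, manual step.

```python
import re


def enable_vnodes(yaml_text: str, num_tokens: int = 256) -> str:
    """Uncomment/set num_tokens and blank out initial_token, leaving other lines alone."""
    converted = []
    for line in yaml_text.splitlines():
        if re.match(r"^\s*#?\s*num_tokens\s*:", line):
            # Matches both the commented-out and an already active setting.
            converted.append(f"num_tokens: {num_tokens}")
        elif re.match(r"^\s*initial_token\s*:", line):
            # Drop the manually assigned token.
            converted.append("initial_token:")
        else:
            converted.append(line)
    return "\n".join(converted) + "\n"


# Hypothetical file contents, for illustration only.
SAMPLE = "cluster_name: 'Test Cluster'\n#num_tokens:\ninitial_token: 12345\n"
print(enable_vnodes(SAMPLE))
```

Working on plain text rather than a parsed YAML document keeps comments and ordering in the rest of the file untouched.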

Page 23: Cassandra Virtual Node talk

Transitioning to vnodes

Let’s walk through it...

After all Cassandra instances have been restarted:

Initialize a shuffle operation

    [patrick@cassandra0 ~]$ cassandra-shuffle create

Enable shuffling

    [patrick@cassandra0 ~]$ cassandra-shuffle enable

List pending relocations*

    [patrick@cassandra0 ~]$ cassandra-shuffle ls

*This is a slow op. Be patient.

Page 24: Cassandra Virtual Node talk

Existing 1.1 cluster

[Figure: four servers, each owning one contiguous range: Server 1: 1-4, Server 2: 4-8, Server 3: 9-12, Server 4: 13-16]

Page 25: Cassandra Virtual Node talk

[Figure: each server's range is split into several pieces that stay on the same server: Server 1: 1-4, Server 2: 4-8, Server 3: 9-12, Server 4: 13-16]

Set num_tokens and restart

Page 26: Cassandra Virtual Node talk

Set num_tokens and restart

[Figure: each server now holds four vnodes: Server 1: 1, 2, 3, 4; Server 2: 5, 6, 7, 8; Server 3: 9, 10, 11, 12; Server 4: 13, 14, 15, 16]

Initialize and Enable shuffling...

Page 27: Cassandra Virtual Node talk

Shuffle enable

[Figure: vnodes 1 through 16 start relocating among Server 1, Server 2, Server 3, and Server 4]

Page 28: Cassandra Virtual Node talk

Shuffle complete

[Figure: vnodes 1 through 16 redistributed across Server 1, Server 2, Server 3, and Server 4]
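The before and after pictures of the shuffle can be mimicked with a toy model. This sketch only illustrates the idea of scattering vnodes across servers; it does not reproduce how cassandra-shuffle actually schedules relocations, and the names `split_ranges` and `shuffle_vnodes` as well as the even re-dealing policy are assumptions made for the example.

```python
import random


def split_ranges(server_count: int, vnodes_per_server: int) -> dict:
    """Layout right after 'set num_tokens and restart': vnodes stay on their old server."""
    return {
        f"Server {s + 1}": list(
            range(s * vnodes_per_server + 1, (s + 1) * vnodes_per_server + 1)
        )
        for s in range(server_count)
    }


def shuffle_vnodes(layout: dict, seed: int = 0) -> dict:
    """Toy shuffle: mix all vnodes and deal them back out evenly (assumed policy)."""
    rng = random.Random(seed)
    vnodes = [vnode for owned in layout.values() for vnode in owned]
    rng.shuffle(vnodes)
    per_server = len(vnodes) // len(layout)
    return {
        name: sorted(vnodes[i * per_server:(i + 1) * per_server])
        for i, name in enumerate(layout)
    }


before = split_ranges(server_count=4, vnodes_per_server=4)
after = shuffle_vnodes(before)
print("before:", before)
print("after: ", after)
```

Each server ends up with the same number of vnodes as before, but they are no longer one contiguous block of the ring.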

Page 29: Cassandra Virtual Node talk

Ops life with vnodes

• Add any number of nodes
• No token assignments!
• Bigger server? Larger num_tokens
• Decommission any number of nodes
• New nodetool command: status

One more time now!
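The "Bigger server? Larger num_tokens" point can be illustrated by measuring how much of the ring each server ends up owning. This is a simplified sketch with hypothetical server names, randomly chosen tokens, and no replication; the function name `ownership` is made up for the example.

```python
import random

RING_SIZE = 2**127


def ownership(num_tokens_by_server: dict, seed: int = 7) -> dict:
    """Return the fraction of the ring owned by each server.

    Assumption: every token is picked uniformly at random, and a token owns
    the range that starts after the previous token on the ring.
    """
    rng = random.Random(seed)
    ring = sorted(
        (rng.randrange(RING_SIZE), server)
        for server, count in num_tokens_by_server.items()
        for _ in range(count)
    )
    owned = {server: 0 for server in num_tokens_by_server}
    for i, (token, server) in enumerate(ring):
        previous_token = ring[i - 1][0]  # index -1 wraps to the last token
        owned[server] += (token - previous_token) % RING_SIZE
    return {server: size / RING_SIZE for server, size in owned.items()}


shares = ownership({"small-1": 256, "small-2": 256, "big": 512})
for server, share in shares.items():
    print(f"{server}: {share:.1%} of the ring")
```

In this model the server with twice as many tokens ends up with roughly twice the share of the ring, without anyone calculating a token by hand.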

Page 30: Cassandra Virtual Node talk

Bonus new thing

• New Partitioner: Murmur3Partitioner
• Murmur3 replaces MD5
• Slightly faster than MD5 in certain cases
• Go-forward partitioner for NEW clusters
• No need to convert

More details here: https://issues.apache.org/jira/browse/CASSANDRA-3772

Page 31: Cassandra Virtual Node talk

In conclusion...

Go out and try some vnode love today!

Download Cassandra 1.2 now

http://www.datastax.com/download/community

http://cassandra.apache.org/download/

Page 32: Cassandra Virtual Node talk

Some handy references

http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2

Follow me on Twitter for more: @PatrickMcFadin

Page 33: Cassandra Virtual Node talk

We power the apps that transform business.