Cassandra Virtual Node talk
Transcript of Cassandra Virtual Node talk
©2012 DataStax
V is for vnodes
Patrick McFadin, Sr Solution Architect
DataStax
Friday, February 15, 13
Agenda for today
• What is a node?
• How vnodes work
• Converting your cluster
• Benefits
Since the beginning...
Cassandra has had...
Clusters, which have...
Keyspaces, which have...
Column Families, which have...
Row Keys
• Unique in a column family
• Can be up to 64k in size
• Can be sorted in the cluster (Byte Ordered Partitioner)
OR...
• Can be randomly placed in the cluster (Random Partitioner)
Row Keys
How do you...
• Create a random number?
• Make sure the number is big enough?
• Make it reproducible?
MD5 does the job
Input: a Row Key → MD5 → Get: a 128-bit number
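As a rough illustration of the step above, a row key can be hashed with MD5 and the 16-byte digest read as one big integer. This is a minimal Python sketch; the function name `md5_number` is made up for the example, and Cassandra's own partitioner code differs in detail (for instance in how it maps the digest into the range it actually uses).

```python
import hashlib


def md5_number(row_key: str) -> int:
    """Hash a row key with MD5 and read the digest as an unsigned 128-bit integer."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()  # always 16 bytes
    return int.from_bytes(digest, byteorder="big")


if __name__ == "__main__":
    # Sample keys borrowed from the talk; the same key always gives the same number.
    for key in ("8675309", "@PatrickMcFadin"):
        print(key, hex(md5_number(key)))
```

The point of the sketch is only that the result looks random, is always the same size, and is reproducible for a given key.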
Row Keys
Input: 8675309 → MD5 → Get: 0x6cc0d36686e6a433aa76f96773852d35
Input: @PatrickMcFadin → MD5 → Get: 0xcfc2d0610aaa712a8c36711d08a2550a
The number produced is in a range between:
0 and 2^128-1... but Cassandra uses 2^127-1
...otherwise known as a HUGE number.
2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456
Token Assignment
• Each Cassandra node is assigned a token
• Each token is a number inside the huge range
• Tokens mark the ownership range of Row Keys
[Diagram: ring of three nodes, with From/To markers showing the range between neighboring tokens]
Token = 0
Token = 56713727820156410577229101238628035242
Token = 113427455640312821154458202477256070484
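The three tokens on this slide are consistent with dividing the 2^127 range into equal steps, which is one common way tokens were picked by hand. A minimal sketch under that assumption (the helper name `even_tokens` is hypothetical, not a Cassandra API):

```python
RING_SIZE = 2**127  # range the slides give for the Random Partitioner


def even_tokens(node_count):
    """Return one evenly spaced token per node, starting at 0."""
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]


if __name__ == "__main__":
    for token in even_tokens(3):
        print("Token =", token)
```

With three nodes this reproduces the token values shown on the slide.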
Row Key to Token
Input: @PatrickMcFadin → MD5 → Get: 276161727147663567581939045564154008842
Token = 0
Token = 56713727820156410577229101238628035242
Token = 113427455640312821154458202477256070484
[Diagram: the node that owns the key's range answers "I’ll take it!"]
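One way to picture the lookup: sort the node tokens, then find the first token at or above the key's number, wrapping around to the lowest token when the key is larger than all of them. The sketch below assumes that convention (each node owns the range ending at its own token) and uses made-up names; it is not Cassandra's code.

```python
from bisect import bisect_left


def owning_token(node_tokens, key_number):
    """Return the token of the node whose range contains key_number.

    Assumes each node owns the keys up to and including its own token,
    and that the ring wraps around past the highest token.
    """
    ring = sorted(node_tokens)
    index = bisect_left(ring, key_number)  # first token >= key_number
    return ring[index % len(ring)]         # past the end: wrap to the lowest token


if __name__ == "__main__":
    tokens = [
        0,
        56713727820156410577229101238628035242,
        113427455640312821154458202477256070484,
    ]
    key = 276161727147663567581939045564154008842  # number shown on the slide
    print("Owner token:", owning_token(tokens, key))
```

Under these assumptions the slide's number is larger than every token, so the lookup wraps around to the node with Token = 0.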
Cassandra 1.1 Node
• Responsible for a single range of keys
• Range determined by single token
• One server = One token = One node
Commodity node? What you really want.
Time for a new plan
• Hardware is only getting bigger
• One node is responsible for more data
• Token assignments are a pain
Token assignment (sucks)
• Tokens need to be evenly spread
• Growing a ring... not good options
• Shrinking a ring... not good options
• Tokens have to be added to each server config
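To see why growing a hand-assigned ring is awkward, compare evenly spaced tokens before and after adding servers. This sketch assumes tokens are spread by equal division of the 2^127 range; the function names are illustrative only.

```python
RING_SIZE = 2**127


def even_tokens(node_count):
    """Evenly spaced tokens for node_count servers, starting at 0."""
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]


def tokens_kept(old_count, new_count):
    """Count how many old tokens are still valid in the new even layout."""
    return len(set(even_tokens(old_count)) & set(even_tokens(new_count)))


if __name__ == "__main__":
    print("3 -> 4 servers, tokens kept:", tokens_kept(3, 4))
    print("4 -> 8 servers, tokens kept:", tokens_kept(4, 8))
```

In this model, doubling the ring leaves every existing token in place, while going from three to four servers keeps only the token at 0, so nearly every server would need a new token to stay balanced.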
Enter Virtual Nodes
• One server should have many nodes
• Each node should be small
• Tokens should be automatic
[Diagram] Version 1.1: Server 1 holds a single range, 1-4
[Diagram] Version 1.2: Server 1 holds four small ranges: 1, 2, 3, 4
Virtual Node Features
• Default 256 Nodes per server
• Auto assign tokens
• Faster rebuilds of servers
• Faster server add to cluster
• New partitioner (More later)
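A rough way to picture automatic token assignment: each server draws many random tokens instead of being handed one. The sketch below is an illustration only; `assign_vnode_tokens` and `ownership` are hypothetical helpers, and the uniform random draw is an assumption about the general idea rather than Cassandra's actual allocation code.

```python
import random

RING_SIZE = 2**127
NUM_TOKENS = 256  # default from the slide


def assign_vnode_tokens(servers, num_tokens=NUM_TOKENS, seed=0):
    """Give every server num_tokens random tokens; return {token: server}."""
    rng = random.Random(seed)
    ring = {}
    for server in servers:
        assigned = 0
        while assigned < num_tokens:
            token = rng.randrange(RING_SIZE)
            if token not in ring:  # skip the (very unlikely) collision
                ring[token] = server
                assigned += 1
    return ring


def ownership(ring):
    """Return the fraction of the ring owned by each server."""
    tokens = sorted(ring)
    owned = {}
    for i, token in enumerate(tokens):
        previous = tokens[i - 1]  # for i == 0 this wraps to the highest token
        span = (token - previous) % RING_SIZE or RING_SIZE
        owned[ring[token]] = owned.get(ring[token], 0) + span
    return {server: span / RING_SIZE for server, span in owned.items()}


if __name__ == "__main__":
    ring = assign_vnode_tokens(["server1", "server2", "server3", "server4"])
    for server, share in sorted(ownership(ring).items()):
        print(f"{server}: {share:.1%}")
```

With many small ranges per server, the shares come out close to even without anyone calculating tokens by hand.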
Transitioning to vnodes
Super easy!
Find these lines in your cassandra.yaml file:
#num_tokens:
initial_token: <some big number>
Change to:
num_tokens: 256
initial_token:
and restart. Repeat on all nodes in cluster.
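For anyone scripting this change across many servers, the edit can be sketched as a small text transformation. This is only an illustration: `enable_vnodes` is a made-up helper, it assumes both settings sit on their own top-level lines as shown on the slide, and it neither touches files nor restarts anything.

```python
import re


def enable_vnodes(yaml_text, num_tokens=256):
    """Return cassandra.yaml text with num_tokens set and initial_token blanked."""
    lines = []
    for line in yaml_text.splitlines():
        if re.match(r"^#?\s*num_tokens:", line):
            # Uncomment (if needed) and set the vnode count.
            lines.append(f"num_tokens: {num_tokens}")
        elif re.match(r"^initial_token:", line):
            # Clear the hand-assigned token.
            lines.append("initial_token:")
        else:
            lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    before = "#num_tokens:\ninitial_token: 12345\n"
    print(enable_vnodes(before), end="")
```

Each server still needs a restart after its file is changed, as the slide says.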
Transitioning to vnodes
Let’s walk through it...
After all Cassandra instances have been restarted:
Initialize a shuffle operation
[patrick@cassandra0 ~]$ cassandra-shuffle create
Enable shuffling
[patrick@cassandra0 ~]$ cassandra-shuffle enable
List pending relocations*
[patrick@cassandra0 ~]$ cassandra-shuffle ls
*This is a slow op. Be patient.
Existing 1.1 cluster
[Diagram: four-server ring, one contiguous range per server]
Server 1: 1-4
Server 2: 5-8
Server 3: 9-12
Server 4: 13-16
Set num_tokens and restart
[Diagram: each server's range is now split into four vnodes, still contiguous]
Server 1: 1 2 3 4
Server 2: 5 6 7 8
Server 3: 9 10 11 12
Server 4: 13 14 15 16
Initialize and Enable shuffling...
Shuffle enable
[Diagram: four-server ring with vnodes 1-16 being shuffled between Servers 1-4]
Shuffle complete
[Diagram: four-server ring with vnodes 1-16 after the shuffle, Servers 1-4]
Ops life with vnodes
• Add any number of nodes
• No token assignments!
• Bigger server? Larger num_tokens
• Decommission any number of nodes
• New nodetool command: status
One more time now!
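The "Bigger server? Larger num_tokens" bullet can be illustrated by giving one server twice as many random tokens and measuring how much of the ring it ends up owning. This is a simplified simulation with invented names, assuming tokens are drawn uniformly at random; it is not how Cassandra itself computes or reports ownership.

```python
import random

RING_SIZE = 2**127


def simulate_ownership(tokens_per_server, seed=0):
    """tokens_per_server: {server: num_tokens}. Return {server: ring fraction}."""
    rng = random.Random(seed)
    ring = {}
    for server, count in tokens_per_server.items():
        while count > 0:
            token = rng.randrange(RING_SIZE)
            if token not in ring:  # skip the (very unlikely) collision
                ring[token] = server
                count -= 1
    tokens = sorted(ring)
    owned = dict.fromkeys(tokens_per_server, 0)
    for i, token in enumerate(tokens):
        # Each token owns the span back to the previous token (wrapping at i == 0).
        owned[ring[token]] += (token - tokens[i - 1]) % RING_SIZE
    return {server: span / RING_SIZE for server, span in owned.items()}


if __name__ == "__main__":
    shares = simulate_ownership({"small-1": 256, "small-2": 256, "big": 512})
    for server, share in shares.items():
        print(f"{server}: {share:.1%}")
```

In this model the server with twice the tokens ends up with roughly twice the share of the ring, which is the idea behind sizing num_tokens to the hardware.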
Bonus new thing
• New Partitioner: Murmur3Partitioner
• Murmur3 replaces MD5
• Slightly faster than MD5 in certain cases
• Go-forward partitioner for NEW clusters
• No need to convert
More details here: https://issues.apache.org/jira/browse/CASSANDRA-3772
In conclusion...
Go out and try some vnode love today!
Download Cassandra 1.2 now
http://www.datastax.com/download/community
http://cassandra.apache.org/download/
Some handy references
http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes
http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2
Follow me on Twitter for more: @PatrickMcFadin
We power the apps that transform business.