Sharding with MongoDB -- MongoDC 2012
Sharding with MongoDB
Tyler Brock (@TylerBrock)
Philosophy
Concepts
Architecture
Mechanics
Philosophy
“Ruby is designed to make programmers happy.”
MongoDB is a database for developers.
Build, then scale.
How does MongoDB scale?
> db.runCommand({enablesharding: "<dbname>" })
> db.runCommand({ shardcollection: "<namespace>", key: <shardkeypatternobject> })
Concepts
Simple Web Application
[Diagram: the app reads from and writes to a single datastore.]
What happens when your working set exceeds memory?
What happens if your write load is enormous?
Vertical Scaling
[Diagram: the app tier and its single datastore. The datastore is moved to ever larger hardware, from 68 GB RAM with RAID 10 on EBS to 512 GB RAM with RAID 10 on SSD.]
Horizontal Scaling
[Diagram, built up in steps: a single 60gb datastore becomes three datastores of 20gb each. Routing logic, backed by metadata, sits between the app and the datastores. One datastore then grows to 60gb while the other two stay at 20gb, and a balancer is added; the final labels read 30gb, 30gb, 30gb.]
Architecture
Shard
Where your data lives
Really is just a mongod (or a replica set)
Config Server
A mongod started with the --configsvr option
Must have 3 (or 1 in development)
Data is committed using two-phase commit
mongos
Acts as the shard router / proxy
One or as many as you want
Lightweight -- can run on app servers
Caches metadata from config servers
[Diagram, built up in steps: the routing logic and balancing map to mongos, the metadata maps to the config servers, and each datastore maps to a shard. Final layout: app, mongos, three config servers, and three shards, each a replica set (RS) of three mongods.]
Configuration
Bring up mongods or Replica Sets
mongod --shardsvr
mongod --replSet <setname> --shardsvr
Bring up Config Servers
mongod --configsvr
Bring up Mongos
mongos --configdb <list of configdb uris>
Connect to Mongos + Add Shards
> use admin
> db.runCommand({ addShard: "<shard uri>" })
Enable Sharding
> db.runCommand({ enablesharding: "<dbname>" })
Shard a Collection
> db.runCommand({ shardcollection: "<namespace>", key: <key> })
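The configuration steps above can be collected in one place. The following is a minimal sketch in plain JavaScript that only builds the command documents in order; it does not talk to a server. The helper name buildShardingCommands and the example hosts, database, and collection are hypothetical. In a mongo shell connected to a mongos, each document would be passed to db.runCommand() after "use admin".

```javascript
// Hypothetical helper: returns the admin commands from the slides, in order.
// It builds plain objects only and never contacts a server.
function buildShardingCommands({ shardUris, dbName, collection, key }) {
  // One addShard command per shard (a mongod or a replica set).
  const commands = shardUris.map((uri) => ({ addShard: uri }));
  // Enable sharding on the database, then shard one collection in it.
  commands.push({ enablesharding: dbName });
  commands.push({ shardcollection: `${dbName}.${collection}`, key });
  return commands;
}

const setupCommands = buildShardingCommands({
  shardUris: ["shard1.example.net:27018", "shard2.example.net:27018"], // assumed hosts
  dbName: "test",
  collection: "users",
  key: { email: 1 },
});
setupCommands.forEach((c) => console.log(JSON.stringify(c)));
```

The namespace in the last command is the database name and collection name joined by a dot, which is why the slides use "test.users" later on.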
Mechanics
How does MongoDB balance my data?
Keys
Collection: test.users
{ name: "Joe", email: "[email protected]" }
{ name: "Bob", email: "[email protected]" }
{ name: "Tyler", email: "[email protected]" }
> db.runCommand({ shardcollection: "test.users", key: { email: 1 } })
Chunks
[Diagram: the shard key space runs from -∞ to +∞ and starts as one chunk. As data arrives, the chunk splits ("Split!") into two chunks ("This is a chunk", "This is a chunk"), and those chunks split again as they grow.]
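The chunk picture above can be sketched as a toy model. This is not MongoDB's implementation: it assumes numeric shard key values and splits a chunk once it holds more than MAX_KEYS keys, whereas the real server decides based on data size. The names insertKey and MAX_KEYS are made up for the illustration.

```javascript
// Toy model: a chunk covers the shard key range [min, max).
// Assumption: split on key count; the real server splits on data size.
const MAX_KEYS = 4;

function insertKey(chunks, key) {
  // Find the one chunk whose range contains the key.
  const i = chunks.findIndex((c) => key >= c.min && key < c.max);
  const chunk = chunks[i];
  chunk.keys.push(key);
  chunk.keys.sort((a, b) => a - b);
  if (chunk.keys.length <= MAX_KEYS) return;

  // Pick a split point at or after the median that is above the lowest key.
  const lowest = chunk.keys[0];
  const mid = chunk.keys
    .slice(Math.floor(chunk.keys.length / 2))
    .find((k) => k > lowest);
  if (mid === undefined) return; // every key is identical, so no split point exists

  // Split: the left chunk keeps [min, mid), the right chunk takes [mid, max).
  const left = { min: chunk.min, max: mid, keys: chunk.keys.filter((k) => k < mid) };
  const right = { min: mid, max: chunk.max, keys: chunk.keys.filter((k) => k >= mid) };
  chunks.splice(i, 1, left, right);
}

// Start with one chunk covering -Infinity to +Infinity, then insert 1..10.
const chunks = [{ min: -Infinity, max: Infinity, keys: [] }];
for (let key = 1; key <= 10; key += 1) insertKey(chunks, key);
console.log(chunks.map((c) => `[${c.min}, ${c.max})`).join(" "));
// prints "[-Infinity, 3) [3, 5) [5, 7) [7, Infinity)"
```

Note that the chunk ranges always cover the whole key space with no gaps, and that a chunk full of identical key values never splits in this model.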
Splitting
[Diagram: mongos, three config servers, and Shard 1 through Shard 4. Chunks on one shard grow until the instruction "Split this big chunk into 2 chunks" is issued. Final caption: "These chunks have split".]
Balancing
[Diagram: mongos, three config servers, and Shard 1 through Shard 4, with the chunks initially on Shard 1. Captions, in order: "Shard1, move a chunk to Shard2", "Shard1, move another chunk to Shard3", "Shard1, move another chunk to Shard4".]
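The balancing sequence above can be imitated with a very small simulation. This is a sketch under simplifying assumptions, not the real balancer: it treats chunks as interchangeable, looks only at chunk counts, and ignores the metadata updates and data copying a real migration involves. The function name balance is hypothetical.

```javascript
// Toy balancer: move one chunk at a time from the shard with the most chunks
// to the shard with the fewest, until no shard is more than one chunk ahead.
function balance(chunkCounts) {
  const counts = { ...chunkCounts };
  const names = Object.keys(counts);
  const moves = [];
  for (;;) {
    // On ties, reduce() keeps the earlier shard name.
    const from = names.reduce((a, b) => (counts[b] > counts[a] ? b : a));
    const to = names.reduce((a, b) => (counts[b] < counts[a] ? b : a));
    if (counts[from] - counts[to] <= 1) return { counts, moves };
    counts[from] -= 1;
    counts[to] += 1;
    moves.push(`${from} -> ${to}`);
  }
}

// Four chunks start on shard1, as in the diagram.
const balanced = balance({ shard1: 4, shard2: 0, shard3: 0, shard4: 0 });
console.log(balanced.moves);
// prints [ 'shard1 -> shard2', 'shard1 -> shard3', 'shard1 -> shard4' ]
```

With four chunks on one shard and three empty shards, the simulation reproduces the three moves named in the slide captions and ends with one chunk per shard.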
How does MongoDB route my queries?
Routed Request
1. Query arrives at Mongos
2. Mongos routes query to a single shard
3. Shard returns results of query
4. Results returned to client
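Step 2 of the routed request relies on the chunk metadata that mongos caches. A minimal sketch of that lookup follows, assuming a string shard key and a hand-written chunk table. The table contents, the function name routeByShardKey, and the example addresses are made up for the illustration.

```javascript
// Hypothetical chunk table, as a router might cache it from the config
// servers: each chunk covers [min, max) of the shard key and lives on one shard.
const chunkTable = [
  { min: "", max: "h", shard: "shard1" },
  { min: "h", max: "q", shard: "shard2" },
  { min: "q", max: "\uffff", shard: "shard3" },
];

// Routed request: the query names an exact shard key value, so exactly one
// chunk, and therefore one shard, can hold matching documents.
function routeByShardKey(table, keyValue) {
  const chunk = table.find((c) => keyValue >= c.min && keyValue < c.max);
  return chunk.shard;
}

console.log(routeByShardKey(chunkTable, "bob@example.com"));   // shard1
console.log(routeByShardKey(chunkTable, "joe@example.com"));   // shard2
console.log(routeByShardKey(chunkTable, "tyler@example.com")); // shard3
```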
Scatter Gather Request
1. Query arrives at Mongos
2. Mongos broadcasts query to all shards
3. Each shard returns results for query
4. Results combined and returned to client
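The scatter gather steps can be sketched with in-memory arrays standing in for shards. The data and the function name scatterGather are hypothetical, and a real router would send the query to the shards in parallel over the network rather than loop over local arrays.

```javascript
// Toy shards: each holds some user documents (made-up data).
const userShards = {
  shard1: [{ name: "Bob", state: "NY" }],
  shard2: [{ name: "Joe", state: "CA" }],
  shard3: [{ name: "Tyler", state: "NY" }],
};

// Scatter gather: the filter does not include the shard key, so the router
// cannot rule any shard out. It asks every shard and combines the answers.
function scatterGather(allShards, predicate) {
  const contacted = [];
  const results = [];
  for (const [name, docs] of Object.entries(allShards)) {
    contacted.push(name);                    // step 2: broadcast to every shard
    results.push(...docs.filter(predicate)); // step 3: each shard returns its matches
  }
  return { contacted, results };             // step 4: combined results
}

const inNewYork = scatterGather(userShards, (d) => d.state === "NY");
console.log(inNewYork.contacted);                    // all three shards
console.log(inNewYork.results.map((d) => d.name));   // [ 'Bob', 'Tyler' ]
```

Even though shard2 holds no matching document, it is still contacted, which is the cost that a routed request avoids.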
Distributed Merge Sort Request
1. Query arrives at Mongos
2. Mongos broadcasts query to all shards
3. Each shard locally sorts results
4. Results returned to mongos
5. Mongos merges sorted results
6. Combined results returned to client
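Steps 3 and 5 are the two halves of a merge sort split across machines. The sketch below shows the idea with local arrays; the data and the function names sortOnShard and mergeSorted are hypothetical, and it is not the server's actual code.

```javascript
// Step 3: each shard sorts its own matches by the sort field.
function sortOnShard(docs, field) {
  return [...docs].sort((a, b) => (a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0));
}

// Step 5: the router merges the already-sorted streams by repeatedly taking
// the smallest head element. It never re-sorts the full result set.
function mergeSorted(streams, field) {
  const positions = streams.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    streams.forEach((stream, i) => {
      if (positions[i] >= stream.length) return; // this stream is exhausted
      if (best === -1 || stream[positions[i]][field] < streams[best][positions[best]][field]) {
        best = i;
      }
    });
    if (best === -1) return merged; // all streams exhausted
    merged.push(streams[best][positions[best]]);
    positions[best] += 1;
  }
}

// Made-up documents on three shards, sorted by a non shard key field.
const perShard = [
  [{ name: "Bob", state: "NY" }, { name: "Ann", state: "CA" }],
  [{ name: "Joe", state: "TX" }],
  [{ name: "Tyler", state: "NY" }, { name: "Sam", state: "AL" }],
].map((docs) => sortOnShard(docs, "state"));

const mergedByState = mergeSorted(perShard, "state");
console.log(mergedByState.map((d) => d.state).join(","));
// prints "AL,CA,NY,NY,TX"
```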
Queries
By shard key: routed
> db.users.find({ email: "[email protected]" })
Sorted by shard key: routed in order
> db.users.find().sort({ email: -1 })
Find by non shard key: scatter gather
> db.users.find({ state: "NY" })
Sorted by non shard key: distributed merge sort
> db.users.find().sort({ state: 1 })
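The four query shapes above can be told apart by looking only at the shard key, the filter, and the sort. The classifier below is a rough sketch covering just those four shapes; the function name classifyQuery is hypothetical, and real routing also considers ranges and shard key prefixes, which this ignores.

```javascript
// Hypothetical classifier for the four query shapes on the slide.
// Assumption: a filter counts as "by shard key" only when it contains every
// shard key field.
function classifyQuery(shardKey, filter = {}, sort = {}) {
  const keyFields = Object.keys(shardKey);
  const sortFields = Object.keys(sort);
  if (keyFields.every((f) => f in filter)) return "routed";
  if (sortFields.length === 0) return "scatter gather";
  // Sorting by the leading shard key fields lets shards be visited in order.
  const sortsByKey =
    sortFields.length <= keyFields.length &&
    sortFields.every((f, i) => f === keyFields[i]);
  return sortsByKey ? "routed in order" : "distributed merge sort";
}

const usersKey = { email: 1 };
console.log(classifyQuery(usersKey, { email: "someone@example.com" })); // routed
console.log(classifyQuery(usersKey, {}, { email: -1 }));                // routed in order
console.log(classifyQuery(usersKey, { state: "NY" }));                  // scatter gather
console.log(classifyQuery(usersKey, {}, { state: 1 }));                 // distributed merge sort
```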
Writes
Inserts: requires shard key
> db.users.insert({ name: "Bob", email: "[email protected]" })
Removes: routed
> db.users.remove({ email: "[email protected]" })
Removes: scattered
> db.users.remove({ name: "Bob" })
Updates: routed
> db.users.update({ email: "[email protected]" }, { $set: { state: "NY" } })
Updates: scattered
> db.users.update({ state: "CA" }, { $set: { state: "NY" } }, { multi: true })
How do I choose my shard key?
Rule of Thumb
Choose a field that is common to your queries.
Cardinality
Chunks should be able to split.
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { node: 1 }
Better: { node: 1, time: 1 }
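One way to see why { node: 1 } is a poor key is to count how many distinct shard key values a set of documents produces. A chunk can only be split between two different key values, so documents that all share one value stay in one chunk. The sample log entries and the function name distinctKeyValues are made up for this sketch.

```javascript
// Hypothetical log entries, all from one node, modeled on the slide's example.
const nodeLogs = [
  { node: "ny153.example.com", time: "2011-01-02T21:21:56Z" },
  { node: "ny153.example.com", time: "2011-01-02T21:21:57Z" },
  { node: "ny153.example.com", time: "2011-01-02T21:21:58Z" },
];

// Count the distinct shard key values that a set of documents produces.
function distinctKeyValues(docs, shardKey) {
  const fields = Object.keys(shardKey);
  const values = docs.map((d) => JSON.stringify(fields.map((f) => d[f])));
  return new Set(values).size;
}

console.log(distinctKeyValues(nodeLogs, { node: 1 }));          // 1: nowhere to split
console.log(distinctKeyValues(nodeLogs, { node: 1, time: 1 })); // 3: split points exist
```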
Write Scaling
Writes should be distributed.
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { time: 1 }
Better: { node: 1, application: 1, time: 1 }
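The difference between the two keys shows up when counting where new inserts land. The sketch below uses a hand-written chunk layout and made-up log entries; the function name writesPerShard is hypothetical, and the compound key is simplified to node plus time, joined into one string, rather than the slide's three fields.

```javascript
// Count how many inserts land on each shard for a given chunk layout.
// chunks: list of { min, max, shard } over a string key, covering [min, max).
function writesPerShard(chunks, keys) {
  const counts = {};
  for (const key of keys) {
    const chunk = chunks.find((c) => key >= c.min && key < c.max);
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }
  return counts;
}

// Hypothetical incoming log entries from three nodes, in arrival order.
const incoming = [
  { node: "la001", time: "2011-01-02T21:21:56Z" },
  { node: "ny153", time: "2011-01-02T21:21:57Z" },
  { node: "tx042", time: "2011-01-02T21:21:58Z" },
  { node: "la001", time: "2011-01-02T21:21:59Z" },
];

// Shard key { time: 1 }: every new timestamp is larger than all stored ones,
// so every insert falls into the last chunk.
const byTime = writesPerShard(
  [
    { min: "", max: "2010", shard: "shard1" },
    { min: "2010", max: "2011", shard: "shard2" },
    { min: "2011", max: "\uffff", shard: "shard3" },
  ],
  incoming.map((e) => e.time)
);

// Shard key that starts with node: inserts spread out by node name.
const byNode = writesPerShard(
  [
    { min: "", max: "m", shard: "shard1" },
    { min: "m", max: "t", shard: "shard2" },
    { min: "t", max: "\uffff", shard: "shard3" },
  ],
  incoming.map((e) => `${e.node}|${e.time}`)
);

console.log(byTime); // { shard3: 4 }
console.log(byNode); // { shard1: 2, shard2: 1, shard3: 1 }
```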
Query Isolation & Data Locality
Queries should be routed to one shard.
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { msg: 1, node: 1 }
Better: { node: 1, time: 1 }
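A rough way to compare the two keys is to ask whether a typical query, here "all entries for one node", includes the leading field of the shard key. The sketch below assumes that a router can narrow a query to a subset of chunks only when that leading field is present in the filter. The function name canTarget is hypothetical.

```javascript
// Returns true when the filter includes the first field of the shard key,
// the assumption under which a router can narrow the query to some chunks
// instead of broadcasting it to every shard.
function canTarget(shardKey, filter) {
  const keyFields = Object.keys(shardKey);
  return keyFields.length > 0 && keyFields[0] in filter;
}

const nodeQuery = { node: "ny153.example.com" };
console.log(canTarget({ msg: 1, node: 1 }, nodeQuery));  // false: node is not the leading field
console.log(canTarget({ node: 1, time: 1 }, nodeQuery)); // true: node leads the key
```

With { node: 1, time: 1 }, all entries for one node are also adjacent in the key space, which is the data locality the slide title refers to.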
> db.runCommand({enablesharding: "<dbname>" })
> db.runCommand({ shardcollection: "<namespace>", key: <shardkeypatternobject> })
Thanks!
Extra Slides
Config Servers
[Diagram: a single config server mongod, then three config server mongods.]
MongoDB Scaling - Single Node
[Diagram: reads and writes go to a single node, node_a1.]
Read scaling - add Replicas
[Diagram: replicas node_b1 and then node_c1 are added alongside node_a1.]
Write scaling - Sharding
[Diagram: node_a1, node_b1, and node_c1 together form shard1.]
Write scaling - add shards
[Diagram: shard2 (node_a2, node_b2, node_c2) and then shard3 (node_a3, node_b3, node_c3) are added next to shard1.]