MongoDB Deployment Tips
Transcript of MongoDB Deployment Tips
MongoDB Deployment Tips
Jared Rosoff - @forjared
Agenda
• Sizing Machines
– Understanding working set
– Sizing RAM
– Sizing disk
• Configuring Replica Sets
– Understanding failover
– Avoiding single points of failure
– Minimizing recovery time
Overview
[Diagram: a sharded cluster with three shards (each a replica set with one primary and two secondaries), three config servers, and two routers (mongos).]
Servers and Hardware
[Diagram: the same sharded cluster: three shards of three nodes each, three config servers, two routers.]
SIZING RAM AND DISK
[Diagram build-up: how MongoDB uses memory-mapped files]
• Collection data and index data are mapped into the process's virtual address space.
• The total mapped size of collections plus indexes is your virtual memory size (mapped).
• The portion of the mapped data currently held in physical RAM is your resident memory size.
• Everything else lives only on disk and must be paged in when accessed.
• The diagram then shows a second virtual address space; with journaling enabled, data files are mapped a second time, roughly doubling virtual size.
RAM access = 100 ns
Disk access = 10,000 ns
Disk configurations
• Single disk: ~200 seeks / second
• RAID 0 (striping): ~200 seeks / second per striped disk
• RAID 10 (striping + mirroring): ~400 seeks / second per mirrored pair, since reads can be served by either copy
SSDs??
• Seek time of 0.1 ms vs 5 ms for a spinning disk (10,000 seeks / second instead of 200)
• But expensive
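The seek rates above follow directly from seek time: throughput in seeks per second is just one second divided by the seek time. A quick worked example:

```javascript
// Convert a seek time in milliseconds to seeks per second.
function seeksPerSecond(seekTimeMs) {
  return 1000 / seekTimeMs;
}

// Spinning disk: ~5 ms per seek
console.log(seeksPerSecond(5)); // 200
// SSD: ~0.1 ms per seek
console.log(seeksPerSecond(0.1)); // 10000
```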
Tips for sizing hardware
• Know how important page faults are
– If you want low latency, avoid page faults
• Size memory appropriately
– To avoid page faults, fit everything in RAM: collection data + index data
• Provision disk appropriately
– RAID 10 is recommended
– SSDs are fast, if you can afford them
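The RAM rule of thumb (fit collection data plus index data in memory) can be sketched as a quick check. In the mongo shell the real numbers come from db.stats(), whose dataSize and indexSize fields report per-database totals; the helper below just does the arithmetic on sizes you supply:

```javascript
// Rough RAM check: does the working set (data + indexes) fit in memory?
// Sizes are in bytes; in practice you would read dataSize and indexSize
// from db.stats() for each database.
function fitsInRam(dataSizeBytes, indexSizeBytes, ramBytes) {
  const workingSet = dataSizeBytes + indexSizeBytes;
  return { workingSet, fits: workingSet <= ramBytes };
}

const GB = 1024 * 1024 * 1024;
const check = fitsInRam(40 * GB, 8 * GB, 64 * GB);
console.log(check.fits); // true: a 48 GB working set fits in 64 GB of RAM
```

Note that the true working set may be smaller than the total data size if only part of the data is accessed regularly, which is why knowing your working set matters.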
Replica Sets
[Diagram: the same sharded cluster: three shards (a primary and two secondaries each), three config servers, two routers.]
UNDERSTANDING AUTOMATIC FAILOVER
Primary Election
[Diagram: a three-node replica set with one primary and two secondaries.]
As long as a partition can see a majority (>50%) of the cluster, it will elect a primary.
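The majority rule fits in a few lines: a strict majority of the configured set, not of the surviving members, is required. This is a hypothetical helper for illustration, not driver code:

```javascript
// A partition can elect a primary only if it sees a strict majority
// (> 50%) of the *configured* replica set members.
function canElectPrimary(visibleMembers, totalMembers) {
  return visibleMembers > totalMembers / 2;
}

console.log(canElectPrimary(2, 3)); // true:  66% visible, primary elected
console.log(canElectPrimary(1, 3)); // false: 33% visible, read-only
console.log(canElectPrimary(2, 4)); // false: 50% visible, read-only
```

The three cases mirror the failure scenarios on the following slides.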
Simple Failure
[Diagram: one node fails; the primary and one secondary remain connected.]
66% of the cluster visible: a primary is elected.
[Diagram: two nodes fail; a single secondary remains.]
33% of the cluster visible: read-only mode.
Network Partition
[Diagram: a three-node replica set is split by a network partition.]
The side with two members sees 66% of the cluster: a primary is elected there.
The side with one member (the old primary) sees only 33%: it steps down to read-only mode.
Even Cluster Size
[Diagram: a four-node replica set with one primary and three secondaries.]
If two nodes fail, or the set splits 2/2, each side sees only 50% of the cluster. 50% is not a strict majority, so no primary can be elected: read-only mode.
AVOIDING SINGLE POINTS OF FAILURE
Avoid Single Points of Failure
[Diagram: all three members in a single rack.]
Single points of failure: the top-of-rack switch; the rack falls over.
Better
[Diagram: members spread across racks within one data center.]
Remaining single points of failure: loss of internet; the building burns down.
Better yet
[Diagram: members split across two data centers, San Francisco and Dallas.]
Priorities
[Diagram: San Francisco holds the primary and one secondary, both priority 1; Dallas holds a secondary with priority 0.]
Priority 0 marks the disaster recovery data center: that member will never become primary automatically.
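In the shell, this topology is expressed through the priority field of the replica set configuration. A sketch of the config document (the host names are made up for illustration):

```javascript
// Replica set config matching the diagram: two priority-1 members in
// San Francisco, one priority-0 (never-primary) DR member in Dallas.
// Host names are hypothetical.
const config = {
  _id: "myReplSet",
  members: [
    { _id: 0, host: "sf1.example.com:27017", priority: 1 },
    { _id: 1, host: "sf2.example.com:27017", priority: 1 },
    { _id: 2, host: "dal1.example.com:27017", priority: 0 },
  ],
};
// In the mongo shell you would pass this to rs.initiate(config),
// or rs.reconfig(config) for an existing set.
console.log(config.members.filter((m) => m.priority === 0).length); // 1
```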
Even Better
[Diagram: three members across three data centers: San Francisco, Dallas, and New York.]
With three sites, any single data center can fail and the remaining two members still form a majority.
FAST RECOVERY
2 Replicas + Arbiter??
[Diagram: a primary, a secondary, and an arbiter.] Is this a good idea?
1. Healthy set: primary, secondary, arbiter.
2. The secondary fails. The primary and arbiter still form a majority, so the set stays up, but only one copy of the data remains.
3. A replacement secondary joins and must full sync from the primary.
"Uh oh. Full sync is going to use a lot of resources on the primary. So I may have downtime or degraded performance."
With 3 replicas
1. Healthy set: a primary and two secondaries.
2. One secondary fails. The primary and the remaining secondary still form a majority.
3. A replacement secondary full syncs from the surviving secondary.
"Sync can happen from a secondary, which will not impact traffic on the primary."
Tips for choosing replica set topology
• Avoid single points of failure
– Separate racks
– Separate data centers
• Avoid long recovery downtime
– Use journaling
– Use 3+ replicas
• Keep your actives close
– Use priority to control where failovers happen
Summary
• Sizing a machine
– Know your working set size
– Size RAM appropriately
– Provision sufficient disks
• Designing a replica set
– Know how failover happens
– Design for failure
– Design for fast recovery