Webinar: Backups and Disaster Recovery

MongoDB Backups & Disaster Recovery

Sandeep Parikh, Solutions Architect

description

In this webinar, we'll discuss the different ways to back up and restore your single servers, replica sets, and sharded clusters in a disaster scenario. We'll review several approaches, including taking filesystem snapshots, using mongodump and mongorestore, and using MongoDB Management Service (MMS) to back up and restore.

Transcript of Webinar: Backups and Disaster Recovery

Page 1: Webinar: Backups and Disaster Recovery

MongoDB Backups & Disaster Recovery

Sandeep Parikh, Solutions Architect

Page 2: Webinar: Backups and Disaster Recovery

• Curious about backups and disaster recovery

• Proactively planning for your deployment

• Currently facing an impending or ongoing crisis

What Brought You Here?

Page 3: Webinar: Backups and Disaster Recovery

• Discuss recovery vs. availability

• Review backup and restore tools
  – What is the relative complexity of each option?

• Understand high availability
  – What are you trying to protect against?

Goals

Page 4: Webinar: Backups and Disaster Recovery

Availability vs. Recovery

Page 5: Webinar: Backups and Disaster Recovery

Disasters do happen

Page 6: Webinar: Backups and Disaster Recovery

Sometimes they are our fault

Page 7: Webinar: Backups and Disaster Recovery

• Don’t confuse the two

• Distinctly different business requirements

• Technical solutions may converge

Availability vs. Recovery

Page 8: Webinar: Backups and Disaster Recovery

• Availability
  – Data is readable/writable in the face of infrastructure failures
  – Tunable resiliency depending upon failure scenarios

• Recovery
  – Data is safe, nothing is lost
  – Returning from failure is a straightforward task

Definitions

Page 9: Webinar: Backups and Disaster Recovery

• How much data can you afford to lose?

• How long can you afford to be off-line?

• Cost is impacted by these decisions.

Considerations

Page 10: Webinar: Backups and Disaster Recovery

Recovery

Page 11: Webinar: Backups and Disaster Recovery

What’s the most important thing about creating backups?

Restoring

Page 12: Webinar: Backups and Disaster Recovery

If you can’t restore, you aren’t backed up

Page 13: Webinar: Backups and Disaster Recovery

• mongodump/mongorestore

• Storage-level options

• MongoDB Backup Service (new!)

Recovery Approaches

Page 14: Webinar: Backups and Disaster Recovery

mongodump / mongorestore

[Diagram: mongodump writes MongoDB data out to BSON files; mongorestore loads BSON files back into MongoDB]

Page 15: Webinar: Backups and Disaster Recovery

• Can be run in live or offline mode

• Oplog-aware for point-in-time operations

• Filter can be applied in both directions

• http://docs.mongodb.org/manual/reference/program/mongodump/

• http://docs.mongodb.org/manual/reference/program/mongorestore/

mongodump / mongorestore
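
As a rough illustration of the "oplog-aware" bullet above, here is a minimal Python sketch that only builds the command lines for a dump and a matching restore. The helper names, host string, and output directory are hypothetical; `--host`, `--out`, `--db`, `--oplog`, and `--oplogReplay` are real mongodump/mongorestore options, but check them against the documentation for the version you run.

```python
from typing import List, Optional


def mongodump_cmd(host: str, out_dir: str, oplog: bool = True,
                  db: Optional[str] = None) -> List[str]:
    """Build a mongodump argument list. Nothing is executed here."""
    cmd = ["mongodump", "--host", host, "--out", out_dir]
    if db is not None:
        # --oplog only applies to full-instance dumps, so it is skipped
        # when a single database is selected.
        cmd += ["--db", db]
    elif oplog:
        # Also capture oplog entries written while the dump is running.
        cmd.append("--oplog")
    return cmd


def mongorestore_cmd(host: str, dump_dir: str,
                     oplog_replay: bool = True) -> List[str]:
    """Build a mongorestore argument list. Nothing is executed here."""
    cmd = ["mongorestore", "--host", host]
    if oplog_replay:
        # Replay the captured oplog so the restore reflects a single point in time.
        cmd.append("--oplogReplay")
    cmd.append(dump_dir)
    return cmd


# Hypothetical host and directory, for illustration only.
dump = mongodump_cmd("localhost:27017", "/backups/2013-01-01")
restore = mongorestore_cmd("localhost:27017", "/backups/2013-01-01")
print(" ".join(dump))
print(" ".join(restore))
```

The lists can be passed to `subprocess.run(cmd, check=True)` once the arguments have been reviewed. Building the command separately from running it makes it easy to log or dry-run a backup job first.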

Page 16: Webinar: Backups and Disaster Recovery

• Copy files in your data directory (e.g. /data/db)

• Filesystem or block storage snapshot

• Considerations
  – Journaling (on by default*)
  – Ensuring consistency

Storage-level Backups
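
The journaling and consistency considerations above can be sketched as a small decision helper. This is an illustrative sketch, not a complete procedure: the function name, volume path, snapshot name, and size are hypothetical, and it assumes the commonly documented rule that a snapshot is consistent without locking only when journaling is on and the journal lives on the same volume as the data files. Otherwise writes are flushed and blocked with `db.fsyncLock()` for the duration of the snapshot.

```python
from typing import List


def snapshot_steps(journaled: bool, journal_on_data_volume: bool,
                   volume: str = "/dev/vg0/mongodb",
                   snap_name: str = "mdb-snap01",
                   size: str = "100M") -> List[str]:
    """Return the ordered steps for an LVM snapshot backup as plain strings."""
    # Without a journal captured in the same snapshot, the data files alone
    # may be mid-write, so writes must be flushed and blocked first.
    needs_lock = not (journaled and journal_on_data_volume)
    steps: List[str] = []
    if needs_lock:
        steps.append("mongo shell: db.fsyncLock()")
    steps.append(
        f"lvcreate --size {size} --snapshot --name {snap_name} {volume}"
    )
    if needs_lock:
        steps.append("mongo shell: db.fsyncUnlock()")
    return steps


for step in snapshot_steps(journaled=True, journal_on_data_volume=False):
    print(step)
```

The snapshot size here is the space reserved for changes made while the snapshot exists, not the size of the data set, which is one reason snapshots should be copied off and cleaned up promptly.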

Page 17: Webinar: Backups and Disaster Recovery

• Entire database is backed up
  – Backup files will be large

• Fastest way to create a backup

• Fastest way to restore a backup

• Ongoing management requires devops expertise

Storage-level Considerations

Page 18: Webinar: Backups and Disaster Recovery

MongoDB Management Service

Page 19: Webinar: Backups and Disaster Recovery

• Recovery at scale is challenging
  – Deployments are growing all the time

• MongoDB needed a simple, scalable approach to secure recovery

• Ongoing management of manual solutions was difficult

• Overhead needed to be minimal

MMS Backup: Background

Page 20: Webinar: Backups and Disaster Recovery

MMS Backup: Features

• Available
  – Cloud-based service
  – Archived across data centers

• Secure
  – Data is encrypted in transit
  – 2-factor authentication for restores

• Managed
  – Developed and monitored by MongoDB
  – Point-in-time backups

• Overhead
  – Lightweight agent, processes oplog

• Restores
  – Free, unlimited
  – Seed new environments

Page 21: Webinar: Backups and Disaster Recovery

Unlimited restores

Seed dev, test, or new environments

Page 22: Webinar: Backups and Disaster Recovery

Recovery Approaches

[Chart: mongodump, storage-level, and MMS Backup compared along two axes, complexity and scalability]

Page 23: Webinar: Backups and Disaster Recovery

Availability

Page 25: Webinar: Backups and Disaster Recovery

Replication

Page 26: Webinar: Backups and Disaster Recovery

Failure Tolerance

Node

Network

Rack

Data Center

Global

Page 27: Webinar: Backups and Disaster Recovery

Node Failure

Primary

Secondary

Secondary

Page 28: Webinar: Backups and Disaster Recovery

Rack Failure

Rack 2

Rack 1

Primary

Secondary

Secondary

Page 29: Webinar: Backups and Disaster Recovery

Data Center Failure

DC 2

DC 1

Primary

Secondary

Secondary

Page 30: Webinar: Backups and Disaster Recovery

• Increased data redundancy

• Replica Set “majority” drives availability

• Deploy across multiple levels
  – Racks, Regions, Data Centers

• Can support recovery and availability requirements
  – Recovery: geographically dispersed copies of data
  – Availability: multi-level failover protection

Resilient Topology
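
To make "Replica Set majority drives availability" concrete, here is a small worked sketch. The function names and the site layouts are hypothetical. The rule it encodes is that a primary can only be elected while a strict majority of voting members can reach each other.

```python
from typing import Dict, Set


def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the set."""
    return voting_members // 2 + 1


def can_elect_primary(members_per_site: Dict[str, int],
                      failed_sites: Set[str]) -> bool:
    """True if the surviving members still form a majority of the whole set."""
    total = sum(members_per_site.values())
    surviving = sum(count for site, count in members_per_site.items()
                    if site not in failed_sites)
    return surviving >= majority(total)


# Two members in DC 1 and one in DC 2 (hypothetical layout).
two_sites = {"DC1": 2, "DC2": 1}
print(can_elect_primary(two_sites, {"DC2"}))  # the two DC1 members remain
print(can_elect_primary(two_sites, {"DC1"}))  # only one of three remains

# Spreading five members over three sites survives any single site failure.
three_sites = {"DC1": 2, "DC2": 2, "DC3": 1}
print(all(can_elect_primary(three_sites, {site}) for site in three_sites))
```

With only two sites, losing the site that holds the majority leaves the remaining members unable to elect a primary, so they serve reads at best. That is why the third location matters for availability, even if it hosts a single member.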

Page 31: Webinar: Backups and Disaster Recovery

Recovery Examples

Page 32: Webinar: Backups and Disaster Recovery

• Stop the balancer and wait for it to stop (or use a scheduled balancing window)

• Stop one config server (data is still readable/writable)

• Back up the config server

• Execute the backup across all shards

• Restart the config server

• Resume the balancer

Backup: Sharded Cluster
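
The ordering of the steps above matters most when something fails halfway through. A minimal sketch of the sequence follows, with every operation passed in as a callable so that nothing here talks to a real cluster. All names are hypothetical. In practice the balancer steps would typically wrap the mongo shell helpers `sh.stopBalancer()` and `sh.startBalancer()`.

```python
from typing import Callable, Iterable


def backup_sharded_cluster(stop_balancer: Callable[[], None],
                           stop_config_server: Callable[[], None],
                           backup_config_server: Callable[[], None],
                           backup_shard: Callable[[str], None],
                           restart_config_server: Callable[[], None],
                           resume_balancer: Callable[[], None],
                           shards: Iterable[str]) -> None:
    """Run the backup steps in order, always undoing what was stopped."""
    stop_balancer()  # no chunk migrations while the backup runs
    try:
        stop_config_server()  # cluster metadata is now frozen for the backup
        try:
            backup_config_server()
            for shard in shards:
                backup_shard(shard)
        finally:
            # Bring the config server back even if a backup step failed.
            restart_config_server()
    finally:
        # The balancer is resumed last, and also on failure.
        resume_balancer()


# Dry run with stand-in callables that only record what would happen.
log = []
backup_sharded_cluster(
    stop_balancer=lambda: log.append("stop balancer"),
    stop_config_server=lambda: log.append("stop config server"),
    backup_config_server=lambda: log.append("back up config server"),
    backup_shard=lambda name: log.append(f"back up {name}"),
    restart_config_server=lambda: log.append("restart config server"),
    resume_balancer=lambda: log.append("resume balancer"),
    shards=["shard0", "shard1"],
)
print("\n".join(log))
```

The nested `try`/`finally` blocks are the design point: a failed shard backup should never leave the config server down or the balancer disabled.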

Page 33: Webinar: Backups and Disaster Recovery

• Goals:
  – Return to normal operation/configuration, or
  – Amend configuration parameters?

• Normal operation:
  – Stop mongod and mongos processes
  – Restore each shard as you would restore a replica set
  – Restore config server data
  – Restart mongos and mongod

• Amended deployment (advanced!):
  – Shard key, shard topology, config hostnames

Restore: Sharded Cluster

Page 34: Webinar: Backups and Disaster Recovery

• mongodump/mongorestore
  – --oplog[Replay]
  – --objcheck/--repair
  – --dbpath
  – --query/--filter

• bsondump
  – Inspect data at console

• LVM snapshot time/space trade-off
  – Multi EBS backup
  – Clean up snapshots

Other Recovery Tools
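
bsondump is the supported way to inspect a dump, but a quick sanity check of a `.bson` file is possible with nothing more than the BSON framing rule: each document starts with a little-endian int32 giving its total length (including those four bytes) and ends with a zero byte, and a dump file is a plain concatenation of such documents. The sketch below walks that framing without decoding any fields. The function name and the sample bytes are illustrative.

```python
import struct
from typing import Iterator


def iter_bson_documents(data: bytes) -> Iterator[bytes]:
    """Yield the raw bytes of each BSON document in a concatenated dump."""
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError(f"truncated length prefix at offset {offset}")
        (length,) = struct.unpack_from("<i", data, offset)
        # The smallest valid document is 5 bytes: the length plus a terminator.
        if length < 5 or offset + length > len(data):
            raise ValueError(f"bad document length {length} at offset {offset}")
        doc = data[offset:offset + length]
        if doc[-1] != 0:
            raise ValueError(f"missing terminator at offset {offset}")
        yield doc
        offset += length


# Two hand-built documents: {} and {"a": 1} with a 32-bit integer value.
EMPTY_DOC = b"\x05\x00\x00\x00\x00"
ONE_FIELD_DOC = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"

docs = list(iter_bson_documents(EMPTY_DOC + ONE_FIELD_DOC))
print(len(docs))
```

Counting documents this way and comparing against the collection count on the source is a cheap first check that a dump was not truncated. It does not replace an actual test restore.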

Page 35: Webinar: Backups and Disaster Recovery

Summary

Page 36: Webinar: Backups and Disaster Recovery

• Pick the easiest solution and back up immediately
  – Then test the restore process

• Interim solutions are ok – data safety is paramount

• Iterate into a long-term scalable approach

• Happiness is at the intersection of availability and recovery

• Manage complexity and scalability

Work Towards a Balanced Solution

Page 37: Webinar: Backups and Disaster Recovery

Questions?

Page 38: Webinar: Backups and Disaster Recovery

• MMS Backup
  – https://mms.mongodb.com/backup/

• Replication:
  – http://www.mongodb.com/presentations/webinar-replication-and-replica-sets-0

• Advanced Replication:
  – http://www.mongodb.com/presentations/advanced-replication-features

• Backup Strategies:
  – http://docs.mongodb.org/manual/core/backups/

Resources

Page 39: Webinar: Backups and Disaster Recovery