Webinar: Backups and Disaster Recovery

Post on 10-May-2015

In this webinar, we'll discuss the different ways to back up and restore your single servers, replica sets, and sharded clusters in case of a disaster scenario. We'll review various approaches, including taking filesystem snapshots, using mongodump and mongorestore, or leveraging MongoDB Management Service to back up and restore.

Transcript of Webinar: Backups and Disaster Recovery

MongoDB Backups & Disaster Recovery

Sandeep Parikh, Solutions Architect

• Curious about backups and disaster recovery

• Proactively planning for your deployment

• Currently facing an impending or ongoing crisis

What Brought You Here?

• Discuss recovery vs. availability

• Review backup and restore tools
  – What is the relative complexity of each option?

• Understand high availability
  – What are you trying to protect against?

Goals

Availability vs. Recovery

Disasters do happen

Sometimes they are our fault

• Don’t confuse the two

• Distinctly different business requirements

• Technical solutions may converge

Availability vs. Recovery

• Availability
  – Data is readable/writable in the face of infrastructure failures
  – Tunable resiliency depending upon failure scenarios

• Recovery
  – Data is safe, nothing is lost
  – Returning from failure is a straightforward task

Definitions

• How much data can you afford to lose?

• How long can you afford to be offline?

• Cost is impacted by these decisions.

Considerations
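
The first two questions are commonly called the recovery point objective (RPO) and the recovery time objective (RTO). As a rough sketch of the arithmetic, assuming no oplog replay and a steady restore rate, the worst-case data loss is one backup interval and the downtime is roughly data size divided by restore throughput. The `BackupPlan` type, its field names, and the sample figures are illustrative assumptions, not values from the webinar.

```python
from dataclasses import dataclass


@dataclass
class BackupPlan:
    """Hypothetical description of a backup schedule (illustrative fields)."""
    interval_hours: float            # time between successive backups
    data_size_gb: float              # size of the data set to restore
    restore_rate_gb_per_hour: float  # assumed sustained restore throughput


def worst_case_data_loss_hours(plan: BackupPlan) -> float:
    # Without oplog replay, everything written since the last backup is lost,
    # so the worst case is one full backup interval.
    return plan.interval_hours


def estimated_downtime_hours(plan: BackupPlan) -> float:
    # Rough restore time: data size divided by restore throughput.
    return plan.data_size_gb / plan.restore_rate_gb_per_hour


def meets_requirements(plan: BackupPlan, max_loss_hours: float,
                       max_downtime_hours: float) -> bool:
    return (worst_case_data_loss_hours(plan) <= max_loss_hours
            and estimated_downtime_hours(plan) <= max_downtime_hours)


# Made-up numbers: nightly backups of 500 GB restored at 100 GB/hour.
nightly = BackupPlan(interval_hours=24, data_size_gb=500,
                     restore_rate_gb_per_hour=100)
print(worst_case_data_loss_hours(nightly))   # 24
print(estimated_downtime_hours(nightly))     # 5.0
print(meets_requirements(nightly, max_loss_hours=1, max_downtime_hours=8))  # False
```

Tightening either limit generally means more frequent backups, point-in-time recovery, or faster storage, which is where cost comes in.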

Recovery

What’s the most important thing about creating backups?

Restoring

If you can’t restore, you aren’t backed up

• mongodump/mongorestore

• Storage-level options

• MongoDB Backup Service (new!)

Recovery Approaches

mongodump / mongorestore

[Diagram: mongodump writes data from MongoDB out to BSON files; mongorestore loads BSON files back into MongoDB]

• Can be run in live or offline mode

• Oplog-aware for point-in-time operations

• Filter can be applied in both directions

• http://docs.mongodb.org/manual/reference/program/mongodump/

• http://docs.mongodb.org/manual/reference/program/mongorestore/

mongodump / mongorestore
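
A minimal sketch of a live, oplog-aware dump and the matching restore. It only builds the command lines, so it runs without a MongoDB installation. The flags (`--host`, `--out`, `--oplog`, `--oplogReplay`) are real mongodump/mongorestore options; the host names, paths, and helper function names are made up for illustration.

```python
import shlex


def mongodump_cmd(host, out_dir, oplog=True):
    """Build a mongodump command line for a live backup.

    --oplog also captures the oplog entries written while the dump runs,
    so the result reflects a single point in time. It applies to a full
    dump of a replica set member.
    """
    cmd = ["mongodump", "--host", host, "--out", out_dir]
    if oplog:
        cmd.append("--oplog")
    return cmd


def mongorestore_cmd(host, dump_dir, oplog_replay=True):
    """Build the matching mongorestore command line."""
    cmd = ["mongorestore", "--host", host]
    if oplog_replay:
        # Replays the captured oplog entries after loading the BSON files.
        cmd.append("--oplogReplay")
    cmd.append(dump_dir)
    return cmd


# Hypothetical hosts and paths, for illustration only.
dump = mongodump_cmd("db1.example.net:27017", "/backups/2015-05-10")
restore = mongorestore_cmd("db2.example.net:27017", "/backups/2015-05-10")
print(" ".join(shlex.quote(arg) for arg in dump))
print(" ".join(shlex.quote(arg) for arg in restore))
# To execute for real: subprocess.run(dump, check=True)
```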

• Copy files in your data directory (e.g. /data/db)

• Filesystem or block storage snapshot

• Considerations
  – Journaling (on by default*)
  – Ensuring consistency

Storage-level Backups
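
One way to picture the consistency consideration: with journaling on and the journal on the same volume as the data files, a single snapshot is generally enough; otherwise writes are flushed and locked around the snapshot. The sketch below only returns the steps as strings. `db.fsyncLock()`, `db.fsyncUnlock()`, and `lvcreate --snapshot` are real commands, but the volume path, snapshot name, and size are placeholder assumptions.

```python
def snapshot_steps(journal_on_data_volume,
                   volume="/dev/vg0/mongodb",  # hypothetical logical volume
                   name="mdb-snap01",          # hypothetical snapshot name
                   size="100M"):               # space reserved for changes
    """Return the ordered steps for an LVM snapshot backup (a plan only)."""
    snapshot = f"lvcreate --size {size} --snapshot --name {name} {volume}"
    if journal_on_data_volume:
        # The journal lets mongod recover to a consistent state from the
        # snapshot, so no write lock is needed.
        return [snapshot]
    # Otherwise flush and block writes for the duration of the snapshot.
    return [
        "mongo --eval 'db.fsyncLock()'",
        snapshot,
        "mongo --eval 'db.fsyncUnlock()'",
    ]


for step in snapshot_steps(journal_on_data_volume=False):
    print(step)
```

The snapshot itself is quick, so the lock is usually held only briefly; copying the snapshot off the host can happen after the unlock.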

• Entire database is backed up
  – Backup files will be large

• Fastest way to create a backup

• Fastest way to restore a backup

• Ongoing management requires devops expertise

Storage-level Considerations

MongoDB Management Service

• Recovery at scale is challenging
  – Deployments are growing all the time

• MongoDB needed a simple, scalable approach to secure recovery

• Ongoing management of manual solutions was difficult

• Overhead needed to be minimal

MMS Backup: Background

MMS Backup: Features

• Available
  – Cloud-based service
  – Archived across DCs

• Secure
  – Data is encrypted in transit
  – 2-factor auth for restores

• Managed
  – Developed and monitored by MongoDB
  – Point-in-time backups

• Overhead
  – Lightweight agent, processes oplog

• Restores
  – Free, unlimited
  – Seed new environments

Unlimited restores

Seed dev, test, or new environments

Recovery Approaches

[Chart: mongodump, storage-level, and MMS Backup compared along two axes, Complexity and Scalability]

Availability

Replication

Failure Tolerance

[Diagram: levels of failure to tolerate: Node, Network, Rack, Data Center, Global]

Node Failure

[Diagram: a three-member replica set with one primary and two secondaries]

Rack Failure

[Diagram: a three-member replica set (one primary, two secondaries) spread across Rack 1 and Rack 2]

Data Center Failure

[Diagram: a three-member replica set (one primary, two secondaries) spread across DC 1 and DC 2]

• Increased data redundancy

• Replica Set “majority” drives availability

• Deploy across multiple levels
  – Racks, Regions, Data Centers

• Can support recovery and availability requirements
  – Recovery: geographically dispersed copies of data
  – Availability: multi-level failover protection

Resilient Topology
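
To make the "majority" point concrete: a replica set can elect a primary only while a strict majority of its voting members can reach each other. The sketch below checks whether a given placement of members still has a majority after losing one rack or data center. The function names and the sample placements are illustrative assumptions.

```python
def majority(voting_members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return voting_members // 2 + 1


def survives_loss_of(placement: dict, lost_site: str) -> bool:
    """True if the members outside lost_site can still elect a primary.

    placement maps a site name (rack, region, data center) to the number
    of voting members deployed there.
    """
    total = sum(placement.values())
    remaining = total - placement.get(lost_site, 0)
    return remaining >= majority(total)


two_dc = {"DC1": 2, "DC2": 1}
three_dc = {"DC1": 1, "DC2": 1, "DC3": 1}

print(majority(3))                        # 2
print(survives_loss_of(two_dc, "DC1"))    # False: 1 member left, 2 needed
print(survives_loss_of(two_dc, "DC2"))    # True: 2 members left
print(survives_loss_of(three_dc, "DC1"))  # True: 2 members left
```

With only two sites, losing the one that holds the majority leaves the set without a primary, which is one reason to spread members across at least three locations when a data center failure must be tolerated.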

Recovery Examples

• Stop the balancer and wait for it to finish (or use a scheduled window)

• Stop one config server (data is still r/w)

• Back up the config server

• Execute backup across all shards

• Restart config server

• Resume balancer

Backup: Sharded Cluster
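
The ordering of the steps above can be sketched as nested "pause" blocks, so that the config server is restarted and the balancer resumed even if a backup step fails. The `ops` object and its method names are hypothetical stand-ins for whatever tooling actually performs each step (for example, `sh.stopBalancer()` and `sh.startBalancer()` in the mongo shell); `RecordingOps` only records the call order.

```python
from contextlib import contextmanager


@contextmanager
def paused(stop, start):
    """Call stop(), then guarantee start() runs even if the body fails."""
    stop()
    try:
        yield
    finally:
        start()


def backup_sharded_cluster(ops, shards):
    # Balancer off first, so no chunk migrations happen mid-backup.
    with paused(ops.stop_balancer, ops.start_balancer):
        # One config server is stopped while it is backed up;
        # per the slide, data is still readable and writable.
        with paused(ops.stop_config_server, ops.start_config_server):
            ops.backup_config_server()
            for shard in shards:
                ops.backup_shard(shard)


class RecordingOps:
    """Stand-in that records the order of calls (hypothetical interface)."""

    def __init__(self):
        self.log = []

    def __getattr__(self, name):
        # Any method not defined above becomes a recorder for its own name.
        def call(*args):
            self.log.append(" ".join([name, *args]))
        return call


ops = RecordingOps()
backup_sharded_cluster(ops, ["shard0", "shard1"])
for entry in ops.log:
    print(entry)
```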

• Goals:
  – Return to normal operation/configuration, or
  – Amend configuration parameters?

• Normal operation:
  – Stop mongod and mongos processes
  – Restore each shard as a replica set restore
  – Restore config server data
  – Restart mongos and mongod

• Amended deployment (advanced!):
  – Shard key, shard topology, config hostnames

Restore: Sharded Cluster

• mongodump/mongorestore
  – --oplog[Replay]
  – --objcheck/--repair
  – --dbpath
  – --query/--filter

• bsondump
  – inspect data at console

• LVM snapshot time/space trade-off
  – Multi EBS backup
  – clean up snapshots

Other Recovery Tools
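
bsondump is the tool for inspecting dump contents. As a rough illustration of what a mongodump `.bson` file holds, the sketch below walks a stream of concatenated BSON documents using only the 4-byte little-endian length prefix that every BSON document starts with. It does not decode fields and is not a replacement for bsondump; the function name and sample bytes are illustrative.

```python
import io
import struct


def iter_bson_documents(stream):
    """Yield each raw BSON document (as bytes) from a .bson dump stream."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of file
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack("<i", header)  # total size, prefix included
        if length < 5:
            raise ValueError("invalid document length")
        body = stream.read(length - 4)
        # Every BSON document ends with a single 0x00 terminator byte.
        if len(body) != length - 4 or body[-1:] != b"\x00":
            raise ValueError("truncated or corrupt document")
        yield header + body


# Two hand-built documents: {} and {"a": 1} (int32 value).
empty_doc = b"\x05\x00\x00\x00\x00"
one_field = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"

docs = list(iter_bson_documents(io.BytesIO(empty_doc + one_field)))
print(len(docs))               # 2
print([len(d) for d in docs])  # [5, 12]
```

A quick count like this can serve as a sanity check that a dump file is not truncated before attempting a restore.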

Summary

• Pick the easiest solution and back up immediately
  – Then test the restore process

• Interim solutions are ok – data safety is paramount

• Iterate into a long-term scalable approach

• Happiness is at the intersection of availability and recovery

• Manage complexity and scalability

Work Towards a Balanced Solution

Questions?

• MMS Backup
  – https://mms.mongodb.com/backup/

• Replication
  – http://www.mongodb.com/presentations/webinar-replication-and-replica-sets-0

• Advanced Replication
  – http://www.mongodb.com/presentations/advanced-replication-features

• Backup Strategies
  – http://docs.mongodb.org/manual/core/backups/

Resources