The Backup Methods Available for MongoDB
Adamo Tonete
Agenda
Backup importance for companies and backup plans.
Available Methods:
- Disk Snapshot
- mongodump
- rsync or copy
- Point in time backup from Percona
- MongoDB Cloud / Ops Manager backup (on-prem)
- Hot Backup
Q&A
Replica-set and Shard Concepts 101
Why is Backup Important?
Data is usually the most valuable asset in a company.
A company with severe data loss may never get back in business.
Could you imagine a bank losing all its data, or an e-commerce site being offline for 1 week?
Data loss can occur in 4 main situations:
1) Human error
2) DB failure/corruption
3) System failure/collapse
4) Security breach
Backup Plan
Choose the best RPO and RTO for your company.
- Recovery Point Objective (RPO)
- Recovery Time Objective (RTO)
Backup Plan/Disaster Recovery Plan
● RTO is how much time the company can accept being "offline".
● How long should it take to have my application back online?
● RPO is the point in time that the backups must cover when we have a data loss/incident.
● This is an extremely important metric for deciding how often a backup needs to be made.
Example: 1 TB replica set
RTO = 20 minutes
RPO = 30 minutes
95% reads, 5% writes
2000 inserts/day
3000 reviews/day
We have 1 TB of data and...
5 GB is for user login
2 GB/day of new writes
~900 GB is reviews and 40 GB is the favorites (90% of the traffic)
Favorites are updated asynchronously every 20 minutes.
Login, Favorites, Comment/upvote: 90% traffic - 10% data
Historical data/non-favorites: 10% traffic - 90% data
● Back up the user database every 30 minutes
● Back up the favorite topics every 20 minutes (right after the sync)
● Back up the new comments incrementally (using a filter for created_at > last backup)
● Back up the aged history/non-favorites collection once per day
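The incremental step could be sketched as a mongodump call restricted by a query. This is only an illustration: the database name `app`, the collection name `comments`, and the output path are hypothetical, and it assumes `created_at` is stored as a date and that the installed mongodump accepts Extended JSON in `--query`.

```python
import json
import shlex
from datetime import datetime, timezone

def incremental_dump_cmd(last_backup, out_dir, db="app", collection="comments"):
    """Build a mongodump command for documents created after last_backup."""
    # --query takes Extended JSON, so the timestamp is written as
    # {"$date": "<ISO-8601 string>"} rather than as a native date object.
    stamp = last_backup.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query = {"created_at": {"$gt": {"$date": stamp}}}
    return [
        "mongodump",
        "--db", db,                  # hypothetical database name
        "--collection", collection,  # --query requires a single collection
        "--query", json.dumps(query),
        "--out", out_dir,
    ]

# Print the command so it can be reviewed before it is scheduled.
last_backup = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(shlex.join(incremental_dump_cmd(last_backup, "/backups/comments")))
```

The time of the last successful backup has to be stored somewhere durable, otherwise the next run cannot know where to resume.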
5 GB user data - every 30 minutes
40 GB favorites - every 20 minutes
900 GB non-favorite data - once per day
500 MB comments - every hour
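To get a feel for what this schedule costs in storage and I/O, the daily volume can be estimated with some simple arithmetic. The sketch below assumes every run copies the full listed size with no compression or deduplication, which is a simplification and not something the plan itself states.

```python
MINUTES_PER_DAY = 24 * 60

# (name, size in GB per run, minutes between runs), taken from the plan above
PLAN = [
    ("users", 5, 30),
    ("favorites", 40, 20),
    ("non-favorite data", 900, MINUTES_PER_DAY),
    ("comments", 0.5, 60),
]

def daily_volume_gb(plan):
    """Return GB written per day per entry, assuming each run copies the listed size."""
    return {name: size * (MINUTES_PER_DAY // interval) for name, size, interval in plan}

volumes = daily_volume_gb(PLAN)
for name, gb in volumes.items():
    print(f"{name}: {gb} GB/day")
print(f"total: {sum(volumes.values())} GB/day")
```

Under these assumptions the 40 GB favorites set, backed up 72 times a day, produces far more backup traffic than the 900 GB of historical data, which is worth knowing before committing to the intervals.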
What feature should have priority in a recovery situation?
Login, Favorites, Comment/upvote: 90% traffic - 10% data
● With 10% of the data restored, the environment is handling 90% of the requests while the old data is slowly recovered.
● Not all companies count this as meeting the RTO, but others do. It depends on the expectations.
Disk Snapshot
A disk snapshot is a full copy of the data currently on a disk.
The snapshot process may take a while, but the advantage is that when a restore is needed the files are already in the form the database uses.
There is no need to create indexes or run a file restore, so recovery is fast.
Advantages:
Straightforward approach: take a copy of what is on the disk and that's all.
27
Disk Snapshot
Disadvantages:
May slow down the database while the snapshot is being created.
Can take several hours, depending on the disk speed.
No "partial" restore: it is all or nothing.
28
Disk Snapshot
Backup type: Binary copy
Time to backup: High
Complexity: Low
Time to recover: Low
Rsync or scp to a different host
● Consists of copying the entire data folder to a different machine/disk while the mongod process is stopped or all writes are stopped.
● It was very common with MMAP and is still possible with WiredTiger.
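One way to stop writes for the duration of the copy is the server's `fsync` lock. The sketch below is an assumption about how this might be wired together, not a method taken from the talk: `client` is expected to behave like a PyMongo `MongoClient`, the paths are hypothetical, and the `run` parameter exists only so the copy step can be swapped out.

```python
import subprocess
from contextlib import contextmanager

@contextmanager
def writes_locked(client):
    """Flush pending writes to disk and block new ones until the block exits."""
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the copy fails.
        client.admin.command("fsyncUnlock")

def copy_data_dir(client, dbpath, target, run=subprocess.run):
    """Copy the data folder to target with rsync while writes are locked."""
    # The trailing slash makes rsync copy the folder's contents, not the folder itself.
    cmd = ["rsync", "-a", "--delete", dbpath.rstrip("/") + "/", target]
    with writes_locked(client):
        run(cmd, check=True)
    return cmd
```

A call might look like `copy_data_dir(MongoClient("mongodb://secondary:27017"), "/var/lib/mongodb", "backup-host:/backups/mongodb/")`, where both host names are placeholders. Running this against a secondary keeps the lock away from the primary.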
Advantages
Data is ready to be used in the target folder.
Just start the mongod process using the backup folder.
Disadvantages
Needs to stop a secondary or lock writes. May affect performance.
Restore is all or nothing.
33
Rsync or SCP
Backup type: Binary
Time to backup: High
Complexity: Medium
Time to recover: Low
Mongodump
mongodump is bundled with MongoDB and it is the preferred tool to back up a MongoDB database.
It is important to mention that there are 2 steps to perform a disaster recovery when using mongodump:
1) Create the dump file
2) Restore the dump file with mongorestore
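The two steps can be sketched as a pair of commands. This is a minimal illustration: the URIs and the archive path are hypothetical, the `run` parameter is there only so the commands can be inspected without executing them, and restoring into a scratch instance is one possible way to check a dump, not the only one.

```python
import subprocess

def dump_cmd(uri, archive):
    """Step 1: write a compressed dump of the deployment into one archive file."""
    return ["mongodump", f"--uri={uri}", "--gzip", f"--archive={archive}"]

def restore_cmd(uri, archive):
    """Step 2: rebuild the data (and indexes) from the archive file."""
    return ["mongorestore", f"--uri={uri}", "--gzip", f"--archive={archive}"]

def backup_and_verify(source_uri, scratch_uri, archive, run=subprocess.run):
    """Dump from the source, then restore into a scratch instance as a test."""
    run(dump_cmd(source_uri, archive), check=True)
    # A dump that cannot be restored is not a backup, so exercise the restore too.
    run(restore_cmd(scratch_uri, archive), check=True)
```

Restoring into a separate scratch instance keeps the check away from production while still proving that the archive is usable.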
Use mongodump to create backups per:
● Database
● Collection
● Specific value (query)
● Point in time backup (when using replica-sets)
Although the mongodump tool is very versatile, only having a backup file doesn't mean you are safe.
Dump files need to be processed by mongorestore to rebuild the database. An error in the dump file may break the entire restore process.
[Figure: the dump process writing the backup files]
Collection   Start Time   End Time
users        T            T+10
logins       T            T+20
favorites    T+10         T+30
other        T+20         T+40
[Figure: mongorestore reading the backup files]
[Figure: the dump process writing the backup files while also capturing the oplog]
Collection   Start Time   End Time   Oplog
users        T            T+10       T+50
logins       T            T+20       T+40
favorites    T+10         T+30       T+20
messages     T+20         T+40       T+0
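Capturing the oplog alongside the dump corresponds to mongodump's `--oplog` option, with `--oplogReplay` on the restore side. The sketch below only builds the two commands; the URI and directory are hypothetical, and it assumes a full dump of a replica-set member, since `--oplog` cannot be combined with a single database or collection.

```python
def pit_dump_cmd(uri, out_dir):
    """Dump everything and record the operations that occur while the dump runs."""
    # --oplog writes the captured operations to oplog.bson inside out_dir.
    return ["mongodump", f"--uri={uri}", "--oplog", f"--out={out_dir}"]

def pit_restore_cmd(uri, dump_dir):
    """Restore the dump, then replay the captured oplog on top of it."""
    # Replaying the oplog brings every collection to the moment the dump finished.
    return ["mongorestore", f"--uri={uri}", "--oplogReplay", dump_dir]

print(pit_dump_cmd("mongodb://rs-member:27017", "/backups/full"))
print(pit_restore_cmd("mongodb://new-host:27017", "/backups/full"))
```

Without the replay, each collection reflects the moment its own dump ended, which is why the collections in the table do not line up on their own.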
It is easy to achieve a point in time backup in a replica-set with mongodump. However, the same is not true for sharding.
How can we guarantee that all the backups end at the same time?
https://github.com/Percona-Lab/mongodb_consistent_backup
Mongodump + Percona Scripts
Percona's point in time backup is a beta tool from Percona for taking a cluster-wide backup that is consistent to a single point in time.
It relies on mongodump and ensures that all the dumps end at the same time, generating a point in time backup of the cluster.
Full backup, not partial.
Advantages
Highly flexible tool to generate backups.
Default logical backup method offered by MongoDB.
Disadvantages
Default behavior is not point in time.
Restores can take longer as indexes need to be rebuilt.
Backup files need to be tested.
Backup type: logical
Time to backup: depends
Complexity: low to high
Time to recover: depends, usually high
MongoDB Atlas
Fully managed backup service offered by MongoDB.
It is possible to back up using cloud provider snapshots or continuous backup.
Only an agent needs to be installed. The configuration is done through a website. No technical skills are needed.
Backup type: logical/snapshots
Time to backup: low
Complexity: (unknown)
Time to recover: (unknown), probably fast as the data is in the same DC
Percona Hot Backup
Lightweight binary backup method that copies the database to a different folder/disk without affecting the instance performance.
Available in WiredTiger only. Acts much like a disk snapshot, but at the database level.
Generates a point in time copy of the database.
Backup type: binary
Time to backup: medium
Complexity: low
Time to recover: low
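In Percona Server for MongoDB, a hot backup is started with the `createBackup` admin command. The sketch below is an assumption about how it might be called from Python: `client` is expected to behave like a PyMongo `MongoClient`, and the backup directory is a hypothetical path on the database host that the mongod user can write to.

```python
def hot_backup(client, backup_dir):
    """Ask the server to copy its data files into backup_dir while it keeps running."""
    # The server itself writes the copy, so backup_dir is a path on the
    # database host, not on the machine running this script.
    return client.admin.command("createBackup", 1, backupDir=backup_dir)
```

A call might look like `hot_backup(MongoClient("mongodb://localhost:27017"), "/backups/hot/2024-01-01")`. To restore, the mongod process can be started with the resulting folder as its data directory.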
Questions