Data De Duplication New
Digital Information Growth
• IDC predicts that the volume of digital information will grow tenfold between 2006 and 2011
• Protecting critical data is a challenge for all organizations
Traditional Backup
• Backs up the same files repeatedly (full and incremental backups)
• Inflates storage requirements by 5 to 30 times
• Carries the risk of shipping physical tapes
• Bandwidth, storage, and time costs increase significantly
The Backup Challenge
• Identify and protect only the unique data in the backup
• Save only new or unique data from the data set
• Reconstitute all content in its original form on demand
Data De-Duplication is the solution
• Discovers and removes redundant data from the data set
• Reconstitutes all content in its original form, with 100% reliability, at disk speeds
• Reduces the storage and DR requirements for data
• Creates additional recovery points for rolling back to earlier versions of files and system configurations
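The discover-remove-reconstitute cycle above can be sketched as a tiny chunk store. This is a hypothetical, illustrative sketch (not any vendor's implementation): data is split into small chunks, each unique chunk is stored once under its SHA-256 digest, and a file is recorded as a "recipe" of digests from which the original content is rebuilt on demand.

```python
import hashlib

CHUNK_SIZE = 4      # toy chunk size for illustration
store = {}          # digest -> chunk bytes; each unique chunk kept once

def save(data: bytes) -> list[str]:
    """Split data into chunks; store only chunks not seen before."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # redundant chunks are skipped
        recipe.append(digest)
    return recipe

def restore(recipe: list[str]) -> bytes:
    """Reconstitute the original content from its chunk digests."""
    return b"".join(store[d] for d in recipe)

recipe = save(b"AAAABBBBAAAA")            # "AAAA" appears twice, stored once
assert restore(recipe) == b"AAAABBBBAAAA"
print(len(store))                          # prints 2: only 2 unique chunks kept
```

Because restoration is just dictionary lookups and concatenation, content comes back in its original form at disk speeds.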
Traditional Backup vs. De-duplication
Misconceptions Among the Three
Generalized De-duplication
Where Is Data De-Duplication Done?
• Source side (client side)
• Target side (Server side)
[Diagram: client (source side) connected to server (target side) over a LAN]
Source-Based De-duplication
• Eliminates redundant data at the source
• De-duplication is performed at the start of the backup process
• Requires de-duplication-aware backup software on both the client and the target
• Reduces both storage requirements and network bandwidth
Target-Based De-duplication
• Happens at the backup storage device
• Initially saves all backup images to the backup appliance
• No need to change the client's backup software
• Does not reduce network bandwidth
Types of Data De-duplication
• File-level de-duplication
• Block (sub-file) level de-duplication
  – Fixed block level
  – Variable block level
File-Level De-duplication
• Each file is treated as a single chunk
• Duplicate data is not detected at the sub-file level
• A small change in a file results in two separate copies of slightly different files being stored
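The weakness above is easy to demonstrate in a sketch (file contents here are made up for illustration): with whole-file hashing, a one-character edit changes the digest, so both near-identical versions are stored in full.

```python
import hashlib

store = {}   # digest -> full file content; one entry per unique file

def save_file(content: bytes):
    """File-level de-duplication: the whole file is the chunk."""
    store.setdefault(hashlib.sha256(content).hexdigest(), content)

save_file(b"Annual report 2023 ... body text ...")
save_file(b"Annual report 2023 ... body text ....")   # one character added
print(len(store))   # prints 2: both slightly different copies kept in full
```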
Block Level – Fixed Length
– Searches for duplicate data within files using blocks of an arbitrary fixed length
– Misses redundant sub-file data when content shifts
Ex. Adding a person's name to the title of a document shifts the entire content, so the de-duplication tool fails to detect the equivalent data
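The shifting problem can be shown with a toy sketch (tiny 4-byte blocks, purely illustrative): inserting one byte at the front moves every fixed block boundary, so no block digest matches the previous backup.

```python
import hashlib

def fixed_chunks(data: bytes, size: int = 4):
    """Cut data into fixed-size blocks and return their digests."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]

old = b"AAAABBBBCCCC"
new = b"XAAAABBBBCCCC"          # one byte inserted at the front

# The insertion shifts every block boundary, so no digests match and
# the tool sees all of `new` as new data despite the 99% overlap.
shared = set(fixed_chunks(old)) & set(fixed_chunks(new))
print(len(shared))              # prints 0: no blocks detected as duplicates
```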
Block Level – Variable Length
– Not locked to segments of any arbitrary length
– Catches all duplicate segments in the document, no matter where changes occur
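Variable-length chunking works by deriving block boundaries from the content itself. Real systems use rolling hashes (e.g., Rabin fingerprinting); the sketch below substitutes a toy rule (a boundary after every space) purely to illustrate the principle that boundaries re-align after an insertion.

```python
import hashlib

def cdc_chunks(data: bytes):
    """Toy content-defined chunking: boundary after every space byte."""
    chunks, start = [], 0
    for i, byte in enumerate(data):
        if byte == ord(" "):            # content-derived boundary
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return {hashlib.sha256(c).hexdigest() for c in chunks}

old = b"the quick brown fox jumps over the lazy dog"
new = b"X" + old                # the same front-insertion that defeats fixed blocks

# Boundaries follow the content, so they re-align right after the edit:
# only the first chunk differs, and every other chunk is still shared.
shared = cdc_chunks(old) & cdc_chunks(new)
print(len(shared))              # prints 8: all 8 unique chunks of `old` found
```

Contrast with the fixed-length case: the same one-byte insertion that destroyed all matches there leaves every segment after the edit detectable here.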
De-duplication Makes Replication Affordable
• The reduced backup image after de-duplication directly reduces the amount of storage needed at the secondary site
• Makes it possible to achieve disaster recovery more economically
• Backup de-duplication lowers infrastructure, time, operational, and bandwidth costs
Benefits
• Reduces storage requirements
• Reduces the energy needed to power and cool the storage array
• Reduces network bandwidth
• Takes less time to replicate backups
• Enables longer retention periods on disk
• Achieves disaster recovery economically