Data De Duplication New


Transcript of Data De Duplication New

Page 1: Data De Duplication New

Data De-Duplication Technology

Presented By, Rashmi V

[email protected]

Page 2: Data De Duplication New

Digital Information Growth

• IDC predicts that digital information will grow tenfold in volume between 2006 and 2011

• Protecting critical data is a challenge for all organizations

Page 3: Data De Duplication New

Traditional Backup

• Backs up the same files repeatedly (full and incremental backups)

• Inflates storage requirements by 5 to 30 times

• Carries the risk of shipping physical tapes

• Bandwidth, storage, and time costs increase significantly
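The inflation is easy to see with back-of-the-envelope arithmetic. The sketch below uses hypothetical numbers (1 TB of data, 8 weeks of retention, 5% daily change) purely for illustration:

```python
# Hypothetical figures, sketching why traditional backup inflates storage:
# 1 TB of primary data, one weekly full backup retained for 8 weeks,
# plus six daily incrementals per week at a 5% daily change rate.
primary_tb = 1.0
weeks_retained = 8
daily_change_rate = 0.05
incrementals_per_week = 6  # one full + six incrementals each week

fulls = weeks_retained * primary_tb
incrementals = weeks_retained * incrementals_per_week * primary_tb * daily_change_rate
total_backup_tb = fulls + incrementals
inflation = total_backup_tb / primary_tb
print(f"{total_backup_tb:.1f} TB stored for {primary_tb} TB of data "
      f"({inflation:.1f}x inflation)")
```

Even these modest assumptions land inside the 5-to-30-times range quoted above; longer retention or more frequent fulls push the multiplier higher.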

Page 4: Data De Duplication New

Challenges for Backup

• Protect only the unique data during backup

• Save only new or unique data from the data set

• Reconstitute all content in its original form on demand

Page 5: Data De Duplication New

Data De-Duplication is the Solution

• Discovers and removes redundant data from the data set

• Reconstitutes all content in its original form with 100% reliability at disk speeds

• Economizes storage and disaster recovery (DR) requirements

• Creates even more recovery points for "rolling back" to earlier versions of files and system configurations
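The mechanics behind these claims can be sketched in a few lines of Python — a minimal chunk-and-hash store, where the chunk size and the use of SHA-256 are illustrative choices rather than any particular product's design:

```python
import hashlib

def dedup_store(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and store each unique chunk once,
    keyed by its SHA-256 digest. Returns (recipe, chunk_store)."""
    store = {}
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # store the chunk only if new
        recipe.append(digest)             # recipe records the chunk order
    return recipe, store

def reconstitute(recipe, store) -> bytes:
    """Rebuild the original byte stream from the recipe, with full fidelity."""
    return b"".join(store[d] for d in recipe)

data = b"ABCDEFGH" * 100 + b"UNIQUE!!"    # highly redundant sample
recipe, store = dedup_store(data)
assert reconstitute(recipe, store) == data
print(f"{len(recipe)} chunks referenced, only {len(store)} stored")
```

The recipe of digests is what makes extra recovery points cheap: each backup version is just another small list of references into the same chunk store.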

Page 6: Data De Duplication New

Traditional Backup vs. De-duplication

Page 7: Data De Duplication New

Misconceptions among the Three

Page 8: Data De Duplication New

Generalized De-duplication

Page 9: Data De Duplication New

Where is Data De-Duplication Done?

• Source side (client side)

• Target side (Server side)

[Diagram: a client (source side) connected over a LAN to a server (target side)]

Page 10: Data De Duplication New

Source-Based De-duplication

• Eliminates redundant data at the source

• De-duplication is performed at the start of the backup process

• Requires de-duplication-aware backup software at both the client and the target

• Reduces both storage requirements and network bandwidth
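A minimal sketch of the source-side idea, assuming a toy client/server exchange (the class and method names are hypothetical, not a real backup API): the client hashes chunks locally, asks the server which digests it lacks, and ships only those chunks.

```python
import hashlib

CHUNK = 8

def chunks(data):
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

class BackupServer:
    """Hypothetical target: keeps a digest-indexed chunk store."""
    def __init__(self):
        self.store = {}
    def missing(self, digests):
        # Tell the client which chunk digests the server has never seen.
        return [d for d in digests if d not in self.store]
    def receive(self, chunk):
        self.store[hashlib.sha256(chunk).hexdigest()] = chunk

def source_side_backup(data, server):
    """Client hashes chunks locally and ships only chunks unknown to the server."""
    cs = chunks(data)
    digests = [hashlib.sha256(c).hexdigest() for c in cs]
    need = set(server.missing(digests))
    sent = 0
    for c, d in zip(cs, digests):
        if d in need:
            server.receive(c)
            sent += len(c)
            need.discard(d)   # ship each unique chunk only once
    return sent

server = BackupServer()
first = source_side_backup(b"ABCDEFGH" * 50, server)   # first run: unique data crosses the LAN
second = source_side_backup(b"ABCDEFGH" * 50, server)  # repeat run: nothing crosses the LAN
print(first, second)
```

Because duplicate chunks never leave the client, both the network bandwidth and the target storage shrink — which is exactly what distinguishes this from the target-based approach on the next slide.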

Page 11: Data De Duplication New

Target Based De-duplication

• Happens at the backup storage device

• Initially saves all backup images to the backup appliance

• No need to change the client's backup software

• Does not reduce network bandwidth

Page 12: Data De Duplication New

Types of data de-duplication

• File level de-duplication

• Block (sub-file) level de-duplication

– Fixed block length

– Variable block length

Page 13: Data De Duplication New

File level de-duplication

• Each file is treated as a single chunk

• No detection of duplicate data at sub-file level

• A small change in a file results in two separate copies of slightly different files being stored
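A minimal sketch of file-level de-duplication (the helper below is illustrative): the whole file is hashed as one chunk, so an identical copy is detected, but any edit forces a complete second copy.

```python
import hashlib

store = {}  # digest -> file contents

def backup_file(contents: bytes) -> bool:
    """File-level de-duplication: the whole file is one chunk.
    Returns True if the file had to be stored, False if it was a duplicate."""
    digest = hashlib.sha256(contents).hexdigest()
    if digest in store:
        return False
    store[digest] = contents
    return True

report = b"Quarterly report: revenue up 4%." * 100
assert backup_file(report) is True    # first copy is stored
assert backup_file(report) is False   # identical copy is de-duplicated
edited = report + b" (final)"         # one small edit...
assert backup_file(edited) is True    # ...forces a full second copy
print(len(store), "files stored")
```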

Page 14: Data De Duplication New

Block level – Fixed length

– Uses an arbitrary fixed length of data to search for duplicate data within files

– Can miss redundant sub-file data when content shifts

Ex. Adding a person's name to the title of a document shifts the whole content, causing the de-duplication tool to fail to detect equivalent segments
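This failure mode is easy to reproduce. In the toy example below (chunk size and sample content are arbitrary), prepending a short name shifts every fixed-length chunk boundary, so none of the shifted chunks match the originals:

```python
import hashlib

def fixed_chunks(data, size=16):
    """Cut the data into fixed-length chunks at offsets 0, 16, 32, ..."""
    return [data[i:i + size] for i in range(0, len(data), size)]

doc = b"0123456789abcdef" * 64        # 1024 bytes of content
shifted = b"J. Smith - " + doc        # prepend a name to the title

orig = {hashlib.sha256(c).hexdigest() for c in fixed_chunks(doc)}
new = [hashlib.sha256(c).hexdigest() for c in fixed_chunks(shifted)]
matches = sum(1 for d in new if d in orig)
print(f"{matches} of {len(new)} shifted chunks match")
```

Every byte after the insertion lands at a new offset, so every fixed-length chunk hashes differently even though almost all of the content is unchanged.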

Page 15: Data De Duplication New

Block level- Variable length

– Not locked to arbitrary fixed-length segments

– Catches all duplicate segments in the document, no matter where changes occur
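Production systems typically pick boundaries with a rolling hash (e.g., Rabin fingerprints); the resynchronization effect can be shown with a deliberately simplified chunker that cuts after every '.' byte, so boundaries follow content rather than offsets:

```python
import hashlib

def content_chunks(data: bytes) -> list:
    """Toy content-defined chunking: a chunk ends after each '.' byte,
    so boundaries move with the content instead of sitting at fixed offsets."""
    chunks, start = [], 0
    for i, b in enumerate(data):
        if b == ord("."):
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

doc = b"First sentence.Second sentence.Third sentence.Fourth sentence."
edited = b"NEW TEXT." + doc           # insertion shifts all later bytes

orig = {hashlib.sha256(c).hexdigest() for c in content_chunks(doc)}
new = [hashlib.sha256(c).hexdigest() for c in content_chunks(edited)]
matches = sum(1 for d in new if d in orig)
print(f"{matches} of {len(new)} chunks unchanged after the edit")
```

Unlike the fixed-length example on the previous slide, the boundaries downstream of the edit re-align with the content, so every untouched segment still de-duplicates.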

Page 16: Data De Duplication New

DD makes replication affordable

Page 17: Data De Duplication New

Continued…

• The reduced backup image after de-duplication directly reduces the amount of storage needed at the secondary site

• Enables disaster recovery to be achieved more economically

• Backup de-duplication lowers infrastructure, time, operational overhead, and bandwidth costs

Page 18: Data De Duplication New

Benefits

• Reduces Storage requirements

• Reduces the amount of energy needed to power and cool the storage array

• Reduces network bandwidth

• Consumes less time to replicate the backup

• Enables longer retention periods on disk

• Achieves disaster recovery economically

Page 19: Data De Duplication New