Btrfs Specific Dedup · Liu Bo · 2017. 12. 14.

Post on 30-Aug-2021


Btrfs Specific Dedup

Liu Bo

Why does btrfs need dedup?

What is Dedup?

Dedup

• A specialized compression technique

• Eliminate duplicate copies

• Improve storage utilization

But we already have compression?

A Good FS For Backup!

Btrfs:
● CoW B+tree
● 2^64 byte == 16 EiB maximum file size
● Dynamic inode allocation
● Checksum on both data and metadata
● Compression (zlib, lzo supported)
● Integrated multiple device support
● Subvolume, writable/readonly snapshot
● Send/receive
● Etc.

Btrfs Deduplication:
● Inline
● Block level

Back Reference:

● Fingerprint
● Hash algorithm: crc32c vs sha256
● B+tree: dedup tree
● Keys: dedup keys
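To make the fingerprint and dedup-tree idea concrete, here is a minimal user-space sketch. It is not the btrfs implementation: the `DedupIndex` class, its dictionary, and the 4 KiB `BLOCK_SIZE` are illustrative assumptions, with a Python dict standing in for the on-disk B+tree and SHA-256 digests standing in for the dedup keys.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed dedup block size (4k is one of the sizes on the slides)


def fingerprint(block: bytes) -> bytes:
    """Return the SHA-256 digest used here as the block's fingerprint."""
    return hashlib.sha256(block).digest()


class DedupIndex:
    """Toy stand-in for the dedup tree: fingerprint -> [block number, refcount]."""

    def __init__(self):
        self.entries = {}  # fingerprint -> [physical block number, reference count]
        self.blocks = []   # simulated storage holding one copy of each unique block

    def write(self, block: bytes) -> int:
        """Store a block, reusing an existing copy when the fingerprint matches."""
        key = fingerprint(block)
        entry = self.entries.get(key)
        if entry is not None:
            entry[1] += 1          # duplicate: only add another reference
            return entry[0]
        self.blocks.append(block)  # new content: allocate a fresh block
        self.entries[key] = [len(self.blocks) - 1, 1]
        return len(self.blocks) - 1


index = DedupIndex()
data = b"\0" * BLOCK_SIZE * 3 + b"x" * BLOCK_SIZE
mapping = [index.write(data[i:i + BLOCK_SIZE])
           for i in range(0, len(data), BLOCK_SIZE)]
print(mapping)            # logical block -> physical block
print(len(index.blocks))  # number of unique blocks actually stored
```

Three identical zero blocks collapse into one stored block with three references, which is the saving the back-reference bookkeeping has to track.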

Dedup Engine:
● Dedup is an IO filter, like compression
● Takes a bunch of locked pages to process
● Asynchronous helper threads, aiming to work across all online processors
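The engine described above can be approximated in user space as a pool of workers that fingerprint a batch of pages in parallel. This is only a sketch under stated assumptions: `fingerprint_pages` and `hash_page` are hypothetical names, a `ThreadPoolExecutor` stands in for the kernel's asynchronous helper threads, and the worker count simply follows `os.cpu_count()`.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096  # assumed page size for the illustration


def hash_page(page: bytes) -> str:
    """Fingerprint one page; hashlib releases the GIL while hashing large buffers."""
    return hashlib.sha256(page).hexdigest()


def fingerprint_pages(pages, workers=None):
    """Hash a batch of pages on a worker pool, returning digests in input order."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(hash_page, pages))


# A batch of 16 pages containing only 4 distinct contents.
pages = [bytes([i % 4]) * PAGE_SIZE for i in range(16)]
digests = fingerprint_pages(pages)
print(len(digests), len(set(digests)))
```

`pool.map` preserves input order, so each digest can be matched back to its page after the parallel step finishes.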

Flexible Control:
● Register (create the dedup tree)
● Unregister (delete the out-of-date dedup tree)
● Mount options
– "-o dedup"
– "-o dedup_bs=xxx", e.g. 4k, 128k
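As an illustration of how the two mount options could be interpreted, the sketch below parses an option string into a flag and a block size in bytes. The helper names `parse_size` and `parse_mount_options` are hypothetical, and passing both options in one comma-separated string is an assumption based on ordinary mount option syntax; the slides do not say how the kernel patch validates the values.

```python
def parse_size(text: str) -> int:
    """Convert a size such as '4k' or '128k' to bytes (k = 1024, m = 1024**2)."""
    units = {"k": 1024, "m": 1024 ** 2}
    text = text.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def parse_mount_options(options: str) -> dict:
    """Extract the dedup-related settings from a comma-separated option string."""
    result = {"dedup": False, "dedup_bs": None}
    for opt in options.split(","):
        opt = opt.strip()
        if opt == "dedup":
            result["dedup"] = True
        elif opt.startswith("dedup_bs="):
            result["dedup_bs"] = parse_size(opt.split("=", 1)[1])
    return result


print(parse_mount_options("dedup,dedup_bs=128k"))
```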

Conclusion:
● Transparent dedup
● Synchronous, block level
● Compression support
● Tunable granularity, i.e. dedup blocksize
● Not default, easy to control
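Why the dedup blocksize is worth tuning can be shown with a small experiment. The sketch below is an illustration only: `dedup_stats` is a hypothetical helper and the sample data is contrived. It counts how many bytes remain after block-level dedup at a given block size, and how many fingerprints must be kept.

```python
import hashlib


def dedup_stats(data: bytes, block_size: int):
    """Return (bytes stored after dedup, number of fingerprints kept)."""
    unique = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        unique.setdefault(hashlib.sha256(block).digest(), len(block))
    return sum(unique.values()), len(unique)


KIB = 1024
a, b, c = b"A" * 4 * KIB, b"B" * 4 * KIB, b"C" * 4 * KIB
data = a + b + a + c  # the 4 KiB block "a" appears twice

for bs in (4 * KIB, 8 * KIB):
    stored, keys = dedup_stats(data, bs)
    print(f"bs={bs // KIB}k stored={stored} fingerprints={keys}")
```

With 4 KiB blocks the repeated block is found and 12 KiB is stored. With 8 KiB blocks the two halves differ, so nothing is saved, but fewer fingerprints are needed. That is the trade-off behind the tunable granularity.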

Limit:

● Effective on backup, virtualization
● Ineffective on structured data

Performance

[Bar chart: 1G Zero Write (compress: OFF). Configurations: default, dedup(bs=4k), dedup(bs=8k), dedup(bs=64k), dedup(bs=128). Series: First write, Backup-1, Backup-2. Y axis: 0 to 700, unit not given. Data labels in extraction order: 85.9, 136, 163, 195, 199; 88.7, 155, 175, 199, 243; 83.8, 178, 440, 602, 648.]

Performance, cont

[Bar chart: 1G Zero Write (compress: ON). Configurations: default, dedup(bs=4k), dedup(bs=8k), dedup(bs=64k), dedup(bs=128). Series: First write, Backup-1, Backup-2. Y axis: 0 to 900, unit not given. Data labels in extraction order: 323, 136, 163, 195, 199; 327, 154, 175, 198, 202; 843, 155, 207, 239, 290.]

Demo

Known Issues:

● ENOSPC
● A byte to byte comparison
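The slide does not elaborate on the byte-to-byte comparison. One common reading is that a fingerprint match alone does not prove two blocks are identical, so a careful implementation compares the bytes before sharing a block. The sketch below illustrates that idea under loud assumptions: `weak_fingerprint` is a deliberately poor toy checksum (a byte sum, not crc32c or sha256) chosen so that a collision is easy to produce, and `VerifyingStore` is a hypothetical name.

```python
def weak_fingerprint(block: bytes) -> int:
    """Deliberately weak toy checksum (byte sum) so collisions are easy to show."""
    return sum(block) & 0xFFFFFFFF


class VerifyingStore:
    """Block store that confirms every fingerprint hit with a full byte comparison."""

    def __init__(self):
        self.by_fingerprint = {}  # fingerprint -> list of candidate block numbers
        self.blocks = []

    def write(self, block: bytes) -> int:
        candidates = self.by_fingerprint.setdefault(weak_fingerprint(block), [])
        for number in candidates:
            if self.blocks[number] == block:  # byte-to-byte check
                return number                 # true duplicate: share the block
        self.blocks.append(block)             # collision or new data: store it
        candidates.append(len(self.blocks) - 1)
        return len(self.blocks) - 1


store = VerifyingStore()
first = store.write(b"ab" * 2048)
second = store.write(b"ba" * 2048)  # same byte sum, different content
third = store.write(b"ab" * 2048)   # genuine duplicate of the first block
print(first, second, third)
```

Without the comparison, the second write would wrongly be mapped onto the first block. The cost is an extra read of the candidate block on every fingerprint hit.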

QA


Thank you!

Liu Bo

<bo.li.liu@oracle.com>