
Deduplication

CSCI 333, Spring 2019

Logistics

• Lab 2a/b
• Final Project
• Final Exam
• Grades

2

Last Class

• BetrFS [FAST '15]
  – Linux file system using Bε-trees
    • Metadata Bε-tree: path -> struct stat
    • Data Bε-tree: path|{block#} -> 4KiB block
  – Schema maps VFS operations to efficient Bε-tree operations
    • Upserts, range queries
  – Next iteration [FAST '16]: fixed the slowest operations
    • Rangecast delete messages
    • "Zones"
    • Late-binding journal

3

This Class

• Introduction to Deduplication
  – Big picture idea
  – Design choices and tradeoffs
  – Open questions

• Slides from Gala Yadgar & Geoff Kuenning, presented at Dagstuhl

• I’ve added new slides (slides without borders) for extra context

4

Deduplication

Geoff Kuenning Gala Yadgar

Sources of Duplicates

• Different people store the same files
  – Shared documents, code development
  – Popular photos, videos, etc.

• May also share blocks
  – Attachments
  – Configuration files
  – Company logo and other headers

→ Deduplication!

6

Deduplication

• Dedup(e) is one form of compression
• High-level goal: identify duplicate objects and eliminate redundant copies
  – How should we define a duplicate object?
  – What makes a copy "redundant"?

• The answers are application-dependent, and they raise some of the more interesting research questions!

7

857 Desktops at Microsoft

D. Meyer, W. Bolosky. A Study of Practical Deduplication. FAST 2011

8

"Naïve" Deduplication

For each new file:
    Compare each block to all existing blocks
    If new, write block and add pointer
    If duplicate, add pointer to existing copy

9
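The loop above can be sketched directly in Python. This is a toy model of the idea only (the in-memory block list and the 4 KiB block size are illustrative assumptions, not how a real system stores data):

```python
# Toy "naïve" dedup: compare every new block against all stored blocks.
BLOCK_SIZE = 4096  # illustrative block size

def naive_dedup(files):
    store = []    # list of unique blocks (the "existing blocks")
    recipes = {}  # file name -> list of pointers (indices into store)
    for name, data in files.items():
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            # Compare each block to all existing blocks (an O(n) scan --
            # exactly the cost the next slides set out to eliminate).
            for idx, existing in enumerate(store):
                if existing == block:
                    recipe.append(idx)            # duplicate: add pointer
                    break
            else:
                store.append(block)               # new: write block...
                recipe.append(len(store) - 1)     # ...and add pointer
        recipes[name] = recipe
    return store, recipes
```

Two files made of the same 4 KiB block end up sharing a single stored copy, with the per-file recipes pointing at it.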

[Diagram: File1, File2, and File3 sharing blocks]

Are we done?

Identifying Duplicates

• It's unreasonable to "compare each block to all existing blocks"

→ Fingerprints
  – Cryptographic hash of block contents
  – Low collision probability

10


Dedup Fingerprints

• Goal: uniquely identify an object's contents
• How big should a fingerprint be?
  – Ideally, large enough that the probability of a collision is lower than the probability of a hardware error
  – MD5: 16-byte hash
  – SHA-1: 20-byte hash
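As a back-of-the-envelope check, the standard birthday-bound approximation gives the collision probability for an n-bit fingerprint over N unique blocks:

```latex
P_{\mathrm{collision}} \approx \frac{N^2}{2^{n+1}}
```

For SHA-1 (n = 160) and N = 2^40 unique 4 KiB blocks (4 PiB of unique data), this is roughly 2^80 / 2^161 = 2^-81, many orders of magnitude below typical hardware error rates.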

• Technique: the system stores a map (index) from each object's fingerprint to that object's location
  – Compare a new object's fingerprint against all existing fingerprints, looking for a match
  – The index scales with the number of unique objects, not the size of the objects

11
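The index technique above can be sketched as follows. The in-memory dict and the integer location counter are stand-ins for a real on-disk index and chunk layout:

```python
import hashlib

class FingerprintIndex:
    """Toy fingerprint index: SHA-1 digest -> chunk location."""
    def __init__(self):
        self.index = {}     # fingerprint -> location of the stored copy
        self.next_loc = 0   # next free "location" (a stand-in for disk layout)

    def store(self, block):
        fp = hashlib.sha1(block).digest()   # 20-byte fingerprint
        loc = self.index.get(fp)
        if loc is not None:
            return loc, True                # duplicate: point at existing copy
        loc = self.next_loc                 # unique: "write" at a new location
        self.index[fp] = loc
        self.next_loc += 1
        return loc, False
```

Note that the comparison is one hash-table probe per block, and the index holds one entry per *unique* block, which is the scaling property the slide highlights.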

Identifying Duplicates

• It's unreasonable to "compare each block to all existing blocks"
  → Fingerprints: cryptographic hash of block contents, low collision probability

• It's also unreasonable to compare against all fingerprints…
  → Fingerprint cache

12


Fingerprint Lookup

• How should we store the fingerprints?
• Every unique block is a miss → miss rate ≥ 40%
• One solution: Bloom filter

• Challenge: a 2% false positive rate → 1TB for 4PB of data

13

[Diagram: Bloom filter in RAM — two inserts, a negative lookup, and a false-positive lookup]
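A minimal Bloom filter can be sketched as below. The table size and number of hash functions are illustrative; a real deployment sizes them for the target false-positive rate. The k hash functions are derived from one SHA-1 digest via double hashing, which is one common construction:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch (illustrative m and k)."""
    def __init__(self, m_bits=8192, k=4):
        self.m = m_bits
        self.k = k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        # Derive k bit positions from one digest (double hashing).
        d = hashlib.sha1(item).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1   # force odd step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))
```

A "might contain" answer still requires an I/O to fetch the real mapping, which is exactly the gap the next slide addresses.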

How To Implement a Cache?

• (Bloom) Filters help us determine if a fingerprint exists
  – We still need to do an I/O to find the mapping

• Locality in fingerprints?
  – If we sort our index by fingerprint: the cryptographic hash destroys all notions of locality
  – What if we grouped fingerprints by temporal locality of writes?

14
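The temporal-locality idea can be sketched as follows: group fingerprints into "containers" in the order they were written, and on a miss fetch an entire container with one I/O, so the next fingerprints from the same write stream hit in RAM. The class name, container layout, and cache sizes here are all illustrative:

```python
from collections import OrderedDict

class ContainerCache:
    """Sketch of locality-preserving fingerprint caching:
    fingerprints grouped into containers in write order; one I/O
    per container miss prefetches all of its fingerprints."""
    def __init__(self, containers, cache_capacity=2):
        self.containers = containers          # container id -> fingerprints
        self.locate = {fp: cid for cid, fps in containers.items()
                       for fp in fps}
        self.cache = OrderedDict()            # LRU over container ids
        self.capacity = cache_capacity
        self.disk_reads = 0

    def lookup(self, fp):
        cid = self.locate.get(fp)
        if cid is None:
            return False                      # unique fingerprint
        if cid not in self.cache:
            self.disk_reads += 1              # one I/O loads the container
            self.cache[cid] = True
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)
        else:
            self.cache.move_to_end(cid)
        return True
```

If a backup stream rewrites data in roughly the same order it was first written, one disk read amortizes over every fingerprint in the container.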

Reading and Restoring

• How long does it take to read File1?
• How long does it take to read File3?

• Challenge: when is it better to store the duplicates?

15

[Diagram: File1, File2, and File3 sharing blocks]

Write Path

16

[Diagram: write path — File3's chunks are looked up in the fingerprint index; duplicates only add entries to the file recipe, unique chunks go to the chunk store]

Surprise: many writes become faster!

Read Path

17

[Diagram: read path — File3's recipe is read, fingerprints are looked up in the index, and chunks are fetched from the chunk store]

Delete Path

18

[Diagram: delete path — File3's recipe drives lookups in the fingerprint index, and the chunks' reference counters (1 2 1 2 1 1 2) are decremented]

• Challenge: storing reference counts
  – Physically separate from the chunks
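The write/delete interaction with reference counts can be sketched as below. This toy version keeps the counters in an in-memory dict next to the chunks; the slide's point is that real systems must store them physically separate from the chunk data:

```python
import hashlib

class ChunkStore:
    """Toy chunk store with reference counts."""
    def __init__(self):
        self.chunks = {}   # fingerprint -> chunk data
        self.refs = {}     # fingerprint -> reference count

    def write(self, chunk):
        fp = hashlib.sha1(chunk).digest()
        if fp in self.chunks:
            self.refs[fp] += 1          # duplicate: bump the counter
        else:
            self.chunks[fp] = chunk     # unique: store it
            self.refs[fp] = 1
        return fp

    def delete(self, fp):
        self.refs[fp] -= 1
        if self.refs[fp] == 0:          # last reference gone: reclaim
            del self.chunks[fp], self.refs[fp]
```

A chunk is reclaimed only when the last file referencing it is deleted, so deleting one file never corrupts another.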

Chunking

• Chunking: splitting files into blocks
• Fixed-size chunks: usually aligned to device blocks
• What is the best chunk size?

19

[Diagram: File1 and File2 split into fixed-size chunks]

Updates and Versions

• Best case:
  aabbccdd → aAbbccdd

• Worst case:
  aabbccdd → aAabbccdd

20

[Diagram: File1 and two updated versions, File1a and File1b]
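The best/worst cases above can be reproduced with fixed-size chunking. This sketch uses a toy 2-byte chunk size so the slide's eight-byte example splits into four chunks:

```python
import hashlib

def fixed_chunks(data, size=4):
    """Split into fixed-size chunks and fingerprint each one
    (tiny sizes here purely for illustration)."""
    return [hashlib.sha1(data[i:i + size]).digest()
            for i in range(0, len(data), size)]
```

A substitution (best case) changes one chunk and leaves the rest deduplicable, but an insertion (worst case) shifts every later boundary, so almost nothing matches the old version — the motivation for variable-size chunks on the next slide.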

Ideally…

File1b → aAa010bb010cc010dd

Variable-Size Chunks

• Basic idea: a chunk boundary is triggered by a random string

• For example, with trigger 010:
  aa010bb010cc010dd

• Triggers should be:
  – Not too short/long
  – Not too popular (000000…)
  – Easy to identify

21
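The slide's "010" example can be implemented literally: end a chunk wherever the trigger string appears. This is a toy sketch — real triggers come from a hash over a sliding window (next slides), not from a literal substring:

```python
def chunk_on_trigger(data, trigger="010"):
    """Toy variable-size chunking: cut after each occurrence of trigger."""
    chunks, start = [], 0
    t = len(trigger)
    for i in range(t, len(data) + 1):
        # Only match triggers that lie entirely inside the current chunk.
        if i - t >= start and data[i - t:i] == trigger:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])       # trailing chunk with no trigger
    return chunks
```

Because boundaries depend on content rather than position, an insertion near the front changes only the first chunk and every later chunk still dedups — the "Ideally…" case from the previous slide.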

Identifying Chunk Boundaries

• 48-byte windows (empirically, this works)

• Define a set of possible triggers
  → the K highest bits of the hash are == 0

• Rabin fingerprints compute this efficiently

• "Systems" solutions handle the corner cases

• Challenge: parallelize this process

22

[Diagram: a window slides over the stream …010110010011001110100100100110011001001001100110000…; its fingerprint (e.g. 00100010010000000101) is tested with K = 5 to declare "Boundary!"]

Rabin Fingerprints

• "The polynomial representation of the data modulo a predetermined irreducible polynomial" [LBFS, SOSP '01]

• What/why Rabin fingerprints?
  – They compute a rolling hash
  – "Slide the window" in a constant number of operations (intuition: we "add" a new byte and "subtract" an old byte to slide the window by one)
  – Define a chunk boundary once our window's hash matches our target value (i.e., we hit a trigger)

23
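The rolling-window idea can be sketched with a simple Rabin-Karp-style polynomial hash. To be clear about the assumptions: this is not a true Rabin fingerprint over GF(2), the window is far smaller than 48 bytes, and the trigger tests the low K bits rather than the high ones — but the add-a-byte/subtract-a-byte sliding step is the same:

```python
WINDOW = 8          # illustrative window (real systems: ~48 bytes)
BASE = 257
MOD = (1 << 31) - 1
K = 6               # boundary when the low K bits of the hash are zero

def rolling_chunks(data):
    if len(data) <= WINDOW:
        return [data]
    pow_w = pow(BASE, WINDOW - 1, MOD)   # precomputed for the "subtract" step
    h = 0
    for b in data[:WINDOW]:              # hash the first window
        h = (h * BASE + b) % MOD
    chunks, start = [], 0
    for i in range(WINDOW, len(data)):
        if h % (1 << K) == 0 and i - start >= WINDOW:
            chunks.append(data[start:i])  # hash hit the trigger: cut here
            start = i
        # Slide the window: "subtract" the old byte, "add" the new one.
        h = ((h - data[i - WINDOW] * pow_w) * BASE + data[i]) % MOD
    chunks.append(data[start:])
    return chunks
```

Each slide of the window is a constant number of operations regardless of window size, which is the property that makes chunking a whole file linear-time.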

Defining chunk boundaries

• Tradeoff between small and large chunks?
  – Finer granularity of sharing vs. metadata overhead

• With the process just described, how might we:
  – Produce a very small chunk?
  – Produce a very large chunk?

• How might we modify our chunking algorithm to give us "reasonable" chunk sizes?
  – To avoid small chunks: don't consider boundaries until a minimum size threshold
  – To avoid large chunks: as soon as we reach a maximum threshold, insert a chunk boundary

24
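The two fixes above drop straight into the chunking loop. In this sketch, `is_boundary` stands in for any trigger test (e.g. a rolling hash hitting its target), and the min/max thresholds are illustrative values:

```python
def bounded_chunks(data, is_boundary, min_size=2048, max_size=8192):
    """Chunking with the slide's two fixes: ignore boundaries below
    min_size, and force one at max_size."""
    chunks, start = [], 0
    i = 0
    while i < len(data):
        length = i - start + 1
        # Boundaries only count past min_size; max_size forces a cut.
        if (length >= min_size and is_boundary(data, i)) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
        i += 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

With a trigger that always fires, every chunk collapses to the minimum size; with one that never fires, the maximum threshold caps the chunks instead — so all chunk sizes land in [min_size, max_size] (except possibly the file's tail).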

Distributed Storage

Increase storage capacity and performance with multiple storage servers
• Each server is a separate machine (CPU, RAM, HDD/SSD)
• Data access is distributed between servers

✓ Scalability: increase capacity with data growth
✓ Load balancing: independent of workload
✓ Failure handling: network, nodes, and devices always fail

25

Distributed Deduplication

• Where/when should we look for duplicates?
• Where should we store each file?

26

[Diagram: File1, File2, and File3 spread across multiple storage servers]
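One common answer to both questions (a sketch of one design point, not something the slides prescribe) is stateless routing: send each chunk to the server determined by its fingerprint, so duplicates always meet on the same node and every server can dedup purely locally. Real systems often route whole files or "superchunks" instead, trading some dedup for better locality:

```python
import hashlib

def route_chunk(fingerprint, num_servers):
    """Stateless routing sketch: the fingerprint alone names the server,
    so identical chunks always land on the same node."""
    return int.from_bytes(fingerprint[:8], "big") % num_servers
```

Because routing needs no shared index, servers never coordinate on lookups; the cost is that one file's chunks scatter across nodes, which slows restores.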

Challenges (aka Summary)

→ Wonderful theory problems!

27

[Collage of earlier diagrams, each labeled with an open challenge:]
• Approximate membership query (AMQ) structures
• Parallelizing chunking
• Size of the fingerprint dictionary
• Reference counting and bidirectional indexing of chunks

Next Class?

• Specific dedup system(s) (4)
• MapReduce (+ write-optimized) (2)
• Google file system (1)
• RAID (3)

28

Final Project Discussion

• Get with your group
• Find another group
• Pitch your project / show them your proposal
  – React/revise

29