The Bloom Paradox Ori Rottenstreich Joint work with Isaac Keslassy Technion, Israel.
The Bloom Paradox
Ori Rottenstreich
Joint work with Isaac Keslassy
Technion, Israel
Problem Definition
• Requirement: a data structure at the user that gives a fast answer to the membership query "is $x \in S$?"
• Solutions:
o O(n): searching in a list
o O(log(n)): searching in a sorted list
o O(1): but with false positives / negatives
[Figure: the user can access the local cache S at cost 1, or the central memory M, which holds all elements, at cost 10; x and y are example queried elements.]
Two Possible Errors
• False Positive: $x \notin S$, but the data structure answers $x \in S$.
o Results in a redundant access to the local cache. Additional cost of 1.
• False Negative: $y \in S$, but the data structure answers $y \notin S$.
o Results in an expensive access to the central memory instead of the local cache. Additional cost of 10 - 1 = 9.
Bloom Filters (Bloom, 1970)
• Initialization: an array of $m$ zero bits.
• Insertion: each of the $n$ elements is hashed $k$ times, and the corresponding bits are set.
• Query: hash the element $k$ times and check that all the corresponding bits are set.
• False positive rate (probability) of approximately $(1 - e^{-kn/m})^k$.
• No false negatives.
[Figure: a bit array, initially all zeros, with bits set by the hashes of the example elements x, y, z, and w.]
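The three operations on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the talk: the class name `BloomFilter`, the salted SHA-256 hashing scheme, and the sizes `m=1024`, `k=5` are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions (illustrative)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # initialization: array of zero bits

    def _positions(self, item: str):
        # Assumption of this sketch: derive the k hash functions by
        # salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        # Set every bit the element hashes to.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item: str) -> bool:
        # True may be a false positive; False is never wrong.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=5)
for element in ("x", "y"):
    bf.insert(element)
```

An inserted element always answers positively, which is the "no false negatives" property; a positive answer for an element that was never inserted is a false positive.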
Bloom Filters are Widely Used
• Cache/Memory Framework
• Packet Classification
• Intrusion Detection
• Routing
• Accounting
• Beyond networking: Spell Checking, DNA Classification
• Can be found in:
o Google's web browser Chrome
o Google's database system BigTable
o Facebook's distributed storage system Cassandra
o Mellanox's IB Switch System
The Bloom Paradox
Sometimes, it is better to disregard the Bloom filter results, and in fact not even to query it, thus making the Bloom filter useless.
Outline
• Introduction to Bloom Filters
• The Bloom Paradox
o The Bloom Paradox in Bloom Filters
o Analysis of the Bloom Paradox
o The Bloom Paradox in the Counting Bloom Filter
• Summary
Bloom Paradox Example
• Parameters: […]
• Extreme case without locality: all elements have an equal probability of belonging to the cache.
o Toy example
Bloom Paradox Example
• Parameters: […]
• Let $B$ be the set of elements that the Bloom filter indicates are in $S$.
o In particular, $S \subseteq B$, since there are no false negatives in the Bloom filter.
• Surprise: the Bloom filter indicates the membership of $|B|$ elements. Only $|S|$ of them are indeed in $S$.
[Figure, shown once for the intuition and once for the surprise: the local cache S (cost 1) inside the central memory M with all elements (cost 10), together with the set B of elements for which the Bloom filter answers positively.]
Bloom Paradox Example
• When the Bloom filter states that $x \in S$, it is wrong with probability $(|B| - |S|)/|B|$.
• Average cost if we listen to the Bloom filter: $1 + 10 \cdot (|B| - |S|)/|B|$
• Average cost if we don't: 10
• Conclusion: don't listen to the Bloom filter. The Bloom filter is useless!
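To make the comparison concrete, the sketch below evaluates both strategies after a positive Bloom filter answer, using the access costs from the earlier slides (1 for the local cache, 10 for the central memory). The numeric parameters are hypothetical, since the slide's own values did not survive: a universe of 10^10 equally likely elements, a cache of 10^4 elements, and a false positive rate of 10^-3. The function name `paradox_example` is also an assumption of this example.

```python
def paradox_example(universe_size, cache_size, fpr,
                    cache_cost=1.0, memory_cost=10.0):
    """Expected costs after a positive answer, for equally likely elements."""
    # Elements outside the cache that the filter wrongly reports as members.
    false_positives = fpr * (universe_size - cache_size)
    # Expected size of B: all cached elements plus the false positives.
    positives = cache_size + false_positives
    # Probability that a positive answer is wrong.
    p_wrong = false_positives / positives
    # Listening: always pay for the cache, then for the memory on a miss.
    cost_listen = cache_cost + p_wrong * memory_cost
    # Not listening: go straight to the central memory.
    cost_ignore = memory_cost
    return positives, p_wrong, cost_listen, cost_ignore


# Hypothetical toy parameters (not taken from the slides).
positives, p_wrong, cost_listen, cost_ignore = paradox_example(
    universe_size=10**10, cache_size=10**4, fpr=1e-3
)
print(f"|B| ~ {positives:.3g}, wrong with probability {p_wrong:.4f}")
print(f"listen: {cost_listen:.2f}, ignore: {cost_ignore:.2f}")
```

With these assumed numbers almost every positive answer is a false positive, so trusting the filter costs close to 11 per query while ignoring it costs 10. After a negative answer both strategies go to the central memory, so the positive case decides the comparison.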
Costs of the Two Possible Errors
• The cost of a false positive: 1
• The cost of a false negative: […]
• In the cache example: a false positive costs 1 and a false negative costs 10 - 1 = 9
Conditions for the Bloom Paradox
• Let $\Pr(x \in S)$ be the a priori membership probability of an element $x$,
o i.e. before getting the answer of the Bloom filter
• Intuition: The Bloom paradox occurs more often when:
o […] is small
o […] is large (i.e. […] is small)
o […] is small (because the Bloom filter implicitly assumes […])
• Theorem 1: The Bloom paradox occurs if and only if […]
• Boundaries of the Bloom Paradox (for […]): if […] and […], the Bloom paradox occurs if […]
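The exact statement of Theorem 1 is not recoverable from this transcript. As a hedged sketch, the condition below follows from Bayes' rule under the cost model of the earlier slides. The symbols are assumptions of this derivation, not the slide's notation: $p$ is the a priori membership probability, $f$ the false positive rate, and $W_{FP}$, $W_{FN}$ the costs of a false positive and a false negative.

```latex
\begin{align*}
\Pr(x \in S \mid \text{positive}) &= \frac{p}{p + (1-p)\,f}
  && \text{(no false negatives)} \\
\text{expected cost of trusting the answer} &= \frac{(1-p)\,f}{p + (1-p)\,f}\, W_{FP} \\
\text{expected cost of ignoring the answer} &= \frac{p}{p + (1-p)\,f}\, W_{FN} \\
\text{ignoring is cheaper} &\iff p\, W_{FN} < (1-p)\, f\, W_{FP}
\end{align*}
```

In the cache example, with $W_{FP} = 1$ and $W_{FN} = 9$, this says a positive answer is worth acting on only when the a posteriori membership probability exceeds $1/10$.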
Bloom Filter Improvements
• Theorem 1: The Bloom paradox occurs if and only if […]
• Use the formula to improve the Bloom filter:
o Only insert into / query the Bloom filter if the formula expects it to be useful
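One way to read this improvement is as a guard in front of the Bloom filter. The sketch below assumes a usefulness test of the form `prior * fn_cost > (1 - prior) * fpr * fp_cost`, which follows from Bayes' rule under the cache cost model (false positive cost 1, false negative cost 9); it is not necessarily the formula of Theorem 1. The names `bloom_filter_is_useful` and `lookup`, and the callables passed to `lookup`, are hypothetical.

```python
def bloom_filter_is_useful(prior, fpr, fp_cost=1.0, fn_cost=9.0):
    """True if acting on a positive answer is cheaper than ignoring it.

    prior: a priori probability that the element is in the cache.
    fpr:   false positive rate of the Bloom filter.
    """
    return prior * fn_cost > (1.0 - prior) * fpr * fp_cost


def lookup(item, prior, fpr, bloom_query, read_cache, read_memory):
    """Consult the Bloom filter only when it is expected to help."""
    if bloom_filter_is_useful(prior, fpr) and bloom_query(item):
        value = read_cache(item)
        if value is not None:
            return value  # cache hit
    # Filter skipped, negative answer, or false positive: use central memory.
    return read_memory(item)


# Usage with stand-in callables (assumptions for this example).
cache = {"x": "cached"}
hit = lookup("x", 0.5, 1e-3, lambda item: True, cache.get, lambda item: "memory")
skipped = lookup("x", 1e-6, 1e-3, lambda item: True, cache.get, lambda item: "memory")
```

The same test can be applied at insertion time, so that elements for which the filter would be ignored anyway never consume bits in it.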
Counting Bloom Filters (CBFs)
• Bloom filters do not support deletions of elements. Simply resetting bits might cause false negatives.
• The solution: Counting Bloom filters, which store an array of counters instead of bits.
o Insertion: Incrementing counters by one.
o Deletion: Decrementing counters by one.
o Query: Checking that counters are positive.
• The same false positive probability.
• Require too much memory, e.g. 57 bits per element for […].
[Figure: a counter array whose entries are incremented by one as the example elements x and y are inserted.]
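The counter-based operations can be sketched as follows. This is an illustrative implementation only: the class name `CountingBloomFilter`, the salted SHA-256 hashing, the unbounded Python integers used as counters, and the helper `counter_values` are assumptions of this example.

```python
import hashlib


class CountingBloomFilter:
    """Minimal counting Bloom filter with m counters and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _positions(self, item: str):
        # Assumption of this sketch: salted SHA-256 as the k hash functions.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def delete(self, item: str) -> None:
        # Only valid for an item that was inserted earlier.
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def query(self, item: str) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(item))

    def counter_values(self, item: str):
        # The counters an item hashes to, e.g. to compute their product.
        return [self.counters[pos] for pos in self._positions(item)]


cbf = CountingBloomFilter(m=1024, k=5)
cbf.insert("x")
cbf.delete("x")  # every counter returns to zero
cbf.insert("y")
```

On memory: if one assumes 4-bit counters and the usual optimal sizing of about 1.44·log2(1/f) counters per element, a false positive rate of f = 10^-3 needs roughly 14.4 counters, or about 57 bits, per element. That would match the figure on the slide, but the slide's own parameter is missing, so this reading is a guess.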
Counting Bloom Filter Query
• Query
o Checking that counters are positive.
o Question: Which is more likely to be correct? y or z?
[Figure: a counter array in which the queried elements y and z hash to counters with different values.]
The Bloom Paradox in the Counting Bloom Filter
• Theorem 2: Let $C_1, \dots, C_k$ denote the values of the counters pointed to by the set of $k$ hash functions. Then, […]
• Only the product of the counters matters!
CBF Based Membership Probability
• Parameters: n = 3328, m = 28485, k = 6
• Before checking the CBF: a priori membership probability ≈ 0.03
• The CBF indicates a counters product of 8: a posteriori membership probability ≈ 0.69
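The effect of the counters product can be reproduced approximately. The sketch below is not the exact formula of Theorem 2, which did not survive in this transcript. It assumes each counter is Poisson distributed with mean nk/m when the queried element is absent, and one plus such a variable when it is present, which gives a likelihood ratio of c·m/(nk) per counter value c. The function name `cbf_posterior` is hypothetical.

```python
from math import prod


def cbf_posterior(prior, counters, n, m, k):
    """Approximate a posteriori membership probability from CBF counters.

    Poisson approximation (an assumption of this sketch): each counter
    contributes a likelihood ratio of c / load, with load = n * k / m.
    """
    load = n * k / m  # expected counter value due to the n stored elements
    likelihood_ratio = prod(c / load for c in counters)
    odds = prior / (1.0 - prior) * likelihood_ratio
    return odds / (1.0 + odds)


# Slide parameters, with two hypothetical counter patterns of product 8.
a = cbf_posterior(0.03, [1, 1, 1, 2, 2, 2], n=3328, m=28485, k=6)
b = cbf_posterior(0.03, [8, 1, 1, 1, 1, 1], n=3328, m=28485, k=6)
print(f"{a:.3f} {b:.3f}")
```

Both counter patterns yield the same value because only their product enters the ratio. With a prior of exactly 0.03 the approximation gives about 0.68, close to the ≈ 0.69 on the slide; the small gap is consistent with the rounded prior and the approximation. A zero counter drives the posterior to 0, matching the absence of false negatives.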
Experimental Results
• Internet trace (equinix-chicago) with real hash functions.
• Counting Bloom filter parameters: $n = 2^{10}$, $m/n = 30$, $k = 5$, $2^{20}$ queries
Concluding Remarks
• Discovery of the Bloom paradox
• Importance of the a priori membership probability
• Using the counters product to estimate the correctness of a positive indication of the CBF
Thank You