SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF...
Transcript of SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF...
![Page 1: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/1.jpg)
SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT TRIES
Huanchen ZhangHyeontaek Lim, Viktor Leis, David G. Andersen
Michael Kaminsky, Kimberly Keeton, Andrew Pavlo
![Page 2: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/2.jpg)
Filters answer approximate membership queries
2
![Page 3: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/3.jpg)
Filters answer approximate membership queries
Billionaire
2
![Page 4: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/4.jpg)
Filters answer approximate membership queries
Billionaire
2
![Page 5: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/5.jpg)
Filters answer approximate membership queries
Billionaire
2
YES, 100% No False Negatives
![Page 6: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/6.jpg)
Filters answer approximate membership queries
Billionaire
2
![Page 7: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/7.jpg)
Filters answer approximate membership queries
Billionaire
2
NO, 99%
![Page 8: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/8.jpg)
Filters answer approximate membership queries
Billionaire
2
NO, 99%
YES, 1%
![Page 9: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/9.jpg)
Filters answer approximate membership queries
Billionaire
2
NO, 99%
YES, 1%
![Page 10: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/10.jpg)
Filters answer approximate membership queries
Billionaire
2
NO, 99%
YES, 1% False Positive Rate
![Page 11: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/11.jpg)
3
Local Memory
Slow Devices
Queries
Filters pre-reject most negative queries
![Page 12: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/12.jpg)
3
Local Memory
Slow Devices
Queries
Filters pre-reject most negative queries
NOProbably YES
![Page 13: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/13.jpg)
Existing filters only support point filtering
Point Filtering
4
Bloom Filter (1970)
Quotient Filter (2012)
Cuckoo Filter (2014)
SELECT * FROM BillionairesWHERE LastName = ‘Pavlo’
![Page 14: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/14.jpg)
Existing filters only support point filtering
Point Filtering
4
Bloom Filter (1970)
Quotient Filter (2012)
Cuckoo Filter (2014)
SELECT * FROM BillionairesWHERE LastName = ‘Pavlo’
Range Filtering
SELECT * FROM BillionairesWHERE LastName LIKE ‘Pav%’
![Page 15: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/15.jpg)
Existing filters only support point filtering
Point Filtering
4
Bloom Filter (1970)
Quotient Filter (2012)
Cuckoo Filter (2014)
SELECT * FROM BillionairesWHERE LastName = ‘Pavlo’
Range Filtering
SELECT * FROM BillionairesWHERE LastName LIKE ‘Pav%’
![Page 16: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/16.jpg)
Our solution: Succinct Range Filters (SuRF)
First practical, general-purpose range filter
5
SMALL: close to theoretic minimum
FAST: comparable to fastest trees
USEFUL: evaluated in RocksDB
64-bit integer keys, 1% false positive rate: ≈12 bits per key
10 million 64-bit integer keys: ≈ 200 ns per query
speed up range queries by up to 5x
![Page 17: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/17.jpg)
Starting point: a complete trie
S
6
I
G
M
O
D
K
D
D
O
P
S
![Page 18: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/18.jpg)
Starting point: a complete trie
S
6
I
G
M
O
D
K
D
D
O
P
S
TOO BIG
![Page 19: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/19.jpg)
S
7
I
G
M
O
D
K
D
D
O
P
S
S
I
G
MK O
Make it smaller: a truncated trie
![Page 20: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/20.jpg)
S
7
I
G
M
O
D
K
D
D
O
P
S
S
I
G
MK O
Make it smaller: a truncated trie
SIGMODSIGMETRICS
![Page 21: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/21.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
![Page 22: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/22.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
SIGMETRICS
![Page 23: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/23.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
SIGMETRICS
0x18
![Page 24: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/24.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
SIGMETRICS
0x18
SIGMETRICS
E
![Page 25: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/25.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
![Page 26: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/26.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
Each bit reduces FPR by half
![Page 27: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/27.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
Each bit reduces FPR by half
Cannot help range queries
![Page 28: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/28.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
Each bit reduces FPR by half
Cannot help range queries
Benefit point & range queries
![Page 29: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/29.jpg)
8
Use suffix bits to reduce false positive rate
S
I
G
MK O
0x200xC8 0x06
Hashed Suffix Bits Real Suffix Bits
S
I
G
MK O
OD P
Each bit reduces FPR by half
Cannot help range queries
Benefit point & range queries
Weaker distinguishability
![Page 30: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/30.jpg)
Succinct Data Structure
9
… uses an amount of space that is “close” to the information-theoretic lower bound, but still allows efficient query operations. [wikipedia]
![Page 31: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/31.jpg)
SuRF’s encoding is small and fast
10
Small≈10 + suffix bits per key for 64-bit integers
≈14 + suffix bits per key for emails
FastMatches state-of-the-art pointer-based trees
![Page 32: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/32.jpg)
LN-2
LN-1
LN
…, 6, 20, …
…, 12, 21, …
…, 11, 19, …
11
Bloom filters speed up point queries in RocksDB
B
B
B
Cached FiltersB, B, B, …
SSTable
![Page 33: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/33.jpg)
LN-2
LN-1
LN
…, 6, 20, …
…, 12, 21, …
…, 11, 19, …
11
B
B
B
Cached FiltersB, B, B, …
GET(16)
Bloom filters speed up point queries in RocksDB
![Page 34: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/34.jpg)
LN-2
LN-1
LN
…, 6, 20, …
…, 12, 21, …
…, 11, 19, …
11
B
B
B
Cached FiltersB, B, B, …
GET(16)NO
Bloom filters speed up point queries in RocksDB
![Page 35: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/35.jpg)
LN-2
LN-1
LN
…, 6, 20, …
…, 12, 21, …
…, 11, 19, …
12
B
B
B
Cached FiltersB, B, B, …
SEEK(14, 18)
Bloom filters can’t help range queries in RocksDB
![Page 36: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/36.jpg)
LN-2
LN-1
LN
…, 6, 20, …
…, 12, 21, …
…, 11, 19, …
12
B
B
B
Cached FiltersB, B, B, …
SEEK(14, 18)
Bloom filters can’t help range queries in RocksDB
![Page 37: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/37.jpg)
LN-2
LN-1
LN
…, 6, 20, …
…, 12, 21, …
…, 11, 19, …
13
S
S
S
Cached FiltersS, S, S, …
SuRFs can benefit both point and range queries
![Page 38: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/38.jpg)
LN-2
LN-1
LN
…, 6, 20, …
…, 12, 21, …
…, 11, 19, …
13
S
S
S
Cached FiltersS, S, S, … SEEK(14, 18)
GET(16)NO
SuRFs can benefit both point and range queries
![Page 39: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/39.jpg)
Evaluation setup: a time-series benchmark
14
Time
Key: 64-bit timestamp + 64-bit sensor IDValue: 1KB payload
![Page 40: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/40.jpg)
Evaluation setup: a time-series benchmark
14
Time
Key: 64-bit timestamp + 64-bit sensor IDValue: 1KB payload
SEEK(t1, t2)GET(t)
t t1 t2
Queries:
![Page 41: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/41.jpg)
Evaluation setup: a time-series benchmark
14
Time
Key: 64-bit timestamp + 64-bit sensor IDValue: 1KB payload
SEEK(t1, t2)GET(t)
t t1 t2
Queries:
System Config
Dataset: ≈100 GB on SSD
DRAM: 32 GB
![Page 42: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/42.jpg)
Evaluation setup: a time-series benchmark
14
Time
Key: 64-bit timestamp + 64-bit sensor IDValue: 1KB payload
SEEK(t1, t2)GET(t)
t t1 t2
Queries:
Filter ConfigBloom filter: 14 bits per key
SuRF: 4-bit real suffix
System Config
Dataset: ≈100 GB on SSD
DRAM: 32 GB
![Page 43: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/43.jpg)
15
SuRFs still benefit point queries in RocksDB
0
10
20
30
40
Th
rou
gh
pu
t (K
op
s/s)
No Filter Bloom Filter SuRF
All-false point queries
Worst-case Gap
![Page 44: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/44.jpg)
0
2
4
6
8
10T
hro
ug
hp
ut
(Ko
ps/
s)
Percent of queries with empty results
10 20 30 40 50 60 70 80 90 99
No Filter/ Bloom Filter
SuRF
16
SuRFs speed up range queries in RocksDB
![Page 45: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/45.jpg)
0
2
4
6
8
10T
hro
ug
hp
ut
(Ko
ps/
s)
Percent of queries with empty results
10 20 30 40 50 60 70 80 90 99
SuRF
16
5x
SuRFs speed up range queries in RocksDB
No Filter/ Bloom Filter
![Page 46: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/46.jpg)
Conclusion
17
SuRF is a fast and compact data structureoptimized for range filtering
github.com/efficient/SuRF
rangefilter.io[Demo]
![Page 47: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/47.jpg)
Backup Slides
![Page 48: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/48.jpg)
Comparing ARF to SuRF
B1
Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries (ARF uses 2M queries for training)
Bits per Key (held constant)
Range Query Throughput (Mops/s)
False Positive Rate
Build Time (s)
Build Memory (GB)
Training Time (s)
Training Throughput (Mops/s)
ARF SuRF Improvement
14
0.16
25.7
118
26
117
0.02
14
3.3
2.2
1.2
0.02
N/A
N/A
-
20x
12x
98x
1300x
N/A
N/A
![Page 49: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/49.jpg)
LOUDS-Sparse encoding example
B2
a i
h t f t
v1 v2 v3 v4 v5
a i d h t f t1 1 0 0 0 0 0
1 0 1 0 0 1 0v1 v2 v3 v4 v5
Label:
Structure:
Has-child:
Value:
LOUDS-Sparse
d
10N bitsTheoretic Limit ≈ 9.4N bits
moveToChild (p) = select(S, rank(HC, p) + 1)
![Page 50: SuRF: PRACTICAL RANGE FILTERING WITH FAST SUCCINCT …huanche1/slides/surf-sigmod18.pdfComparing ARF to SuRF B1 Experiment: insert 5M 64-bit integers, 10M Zipf-distributed range queries](https://reader034.fdocuments.in/reader034/viewer/2022052013/6029af16c1a7d76c1d624dd3/html5/thumbnails/50.jpg)
LOUDS-DS trades small space for performance
B3
LOUDS-Dense
LOUDS-Sparse
Hot
Cold
space overhead
speed-up
< 1%
3x