LOUDS: Succinct Data Structure
![Page 1: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/1.jpg)
LOUDS: Succinct Tree Data Structure
Yoh Okuno
![Page 2: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/2.jpg)
LOUDS (Level Order Unary Degree Sequence) [Jacobson, 1989]
● LOUDS is a succinct, unlabeled, static tree
● Succinct means the size is close to the information-theoretic minimum
● Can be extended to a labeled tree easily
● Cannot add or delete nodes
● Applications in Japanese IME [Kudo+ 2011]
![Page 3: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/3.jpg)
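The most straightforward way to express a tree
Space usage is 2 N log2 N bits (N: number of nodes)

    struct Node {
      Node *first_child;   // leftmost child, or null for a leaf
      Node *next_sibling;  // next child of the same parent, or null
    };

Pointer representation: 448 bits = 2 * 7 * 32 bits (2 pointers per node, 7 nodes, 32-bit pointers)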
![Page 4: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/4.jpg)
Problem
![Page 5: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/5.jpg)
LOUDS: Succinct Tree Representation
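Use a bit vector instead of pointers
Space consumption is 2N + 1 bits
101011101010000
Only 15 bits! (about 30x smaller than the 448-bit pointer representation)
[Figure: the example tree, each node labeled with its unary degree code, such as 10 or 1110]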
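The encoding on this slide can be sketched as a breadth-first walk that writes one 1 per child followed by a terminating 0, after a leading "10" for the super-root. This is a minimal sketch, not the author's code: `BuildLouds`, `ExampleTree`, and the adjacency-list input are hypothetical, and the 7-node tree shape is inferred by decoding the slide's bit vector.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical input format: children[v] lists the children of node v,
// and node 0 is the root.
std::string BuildLouds(const std::vector<std::vector<int>>& children) {
  assert(!children.empty());
  std::string bits = "10";  // super-root with a single child: the real root
  std::queue<int> pending;  // visits nodes in level (breadth-first) order
  pending.push(0);
  while (!pending.empty()) {
    int node = pending.front();
    pending.pop();
    // Unary degree: one '1' per child, then a terminating '0'.
    for (int child : children[node]) {
      bits += '1';
      pending.push(child);
    }
    bits += '0';
  }
  return bits;
}

// Assumed shape of the slide's 7-node tree:
// node 0 has child 1; node 1 has children 2, 3, 4;
// node 2 has child 5; node 3 has child 6; nodes 4, 5, 6 are leaves.
std::vector<std::vector<int>> ExampleTree() {
  return {{1}, {2, 3, 4}, {5}, {6}, {}, {}, {}};
}
```

For N nodes the output has N ones (one per node, counting the super-root's edge to the root) and N + 1 zeros, which is where the 2N + 1 bits come from.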
![Page 6: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/6.jpg)
Problem
![Page 7: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/7.jpg)
Navigational operations
index 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
bit 1 0 1 0 1 1 1 0 1 0 1 0 0 0 0
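● getRoot() = 0
● isNull(index) = (bit[index] == 0)
● nextSibling(index) = index + 1
● firstChild(index): explained in the following slides
A node is represented by its index in the bit vector.
[Figure: the example tree, nodes labeled with their unary degree codes; a red background marks the values that are actually stored]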
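As a rough illustration of the two simple operations, the sketch below walks a run of siblings until it reaches a 0 bit. It assumes the bit vector is held as a `std::string` of '0' and '1' characters purely for readability; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A position holding 0 marks the end of a sibling list (a null node).
bool IsNull(const std::string& bits, std::size_t index) {
  assert(index < bits.size());
  return bits[index] == '0';
}

// Siblings are stored next to each other in the bit vector.
std::size_t NextSibling(std::size_t index) { return index + 1; }

// Collects the indices of all siblings starting at `first`.
std::vector<std::size_t> Siblings(const std::string& bits, std::size_t first) {
  std::vector<std::size_t> result;
  for (std::size_t i = first; !IsNull(bits, i); i = NextSibling(i)) {
    result.push_back(i);
  }
  return result;
}
```

With the slide's bit vector, starting at index 4 yields the three siblings at indices 4, 5 and 6, and the walk stops at the 0 at index 7.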
![Page 8: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/8.jpg)
Problem
![Page 9: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/9.jpg)
Introducing External Nodes
101011101010000
Add virtual nodes to represent 0s in bit vector
[Figure: the example tree with external (virtual) nodes added; every node is labeled with its bit-vector index, 0 to 14]
![Page 10: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/10.jpg)
Convert Index to Internal ID
rank1: convert index to internal node ID
[Figure: the same tree with bit-vector indices alongside internal node IDs; rank1 maps one to the other]
![Page 11: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/11.jpg)
Getting Child Node
Fact: the N-th internal node's last child is the (N+1)-th external node
[Figure: internal node IDs alongside external node IDs, linked by +1]
![Page 12: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/12.jpg)
Convert External ID to Index
select0: convert external node ID to index
[Figure: external node IDs alongside bit-vector indices; select0 maps one to the other]
![Page 13: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/13.jpg)
How to find the first child?
1. rank1(index): number of 1s in bit[0..index], counting the bit at index itself (as in the rank table)
2. select0(n): index of the n-th 0 in the bit vector, with n counted from 1
3. firstChild(index) = select0(rank1(index)) + 1
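The three steps can be sketched with deliberately naive linear scans. This is only an illustration under stated assumptions: rank1 counts the bit at `index` itself and select0 counts zeros from 1, matching the rank and select0 rows tabulated later in the deck; the function names and the `std::string` bit vector are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of 1s in bits[0..index], inclusive. O(n) scan, for clarity only.
std::size_t Rank1(const std::string& bits, std::size_t index) {
  assert(index < bits.size());
  std::size_t count = 0;
  for (std::size_t i = 0; i <= index; ++i) count += (bits[i] == '1');
  return count;
}

// Index of the n-th 0, with n counted from 1.
// Returns bits.size() when the vector holds fewer than n zeros.
std::size_t Select0(const std::string& bits, std::size_t n) {
  assert(n >= 1);
  std::size_t seen = 0;
  for (std::size_t i = 0; i < bits.size(); ++i) {
    if (bits[i] == '0' && ++seen == n) return i;
  }
  return bits.size();
}

// The child list of the node at `index` starts right after the 0 that
// closes the previous node's child list.
std::size_t FirstChild(const std::string& bits, std::size_t index) {
  assert(index < bits.size() && bits[index] == '1');  // only 1 bits are nodes
  return Select0(bits, Rank1(bits, index)) + 1;
}
```

On the example vector, `FirstChild` of index 2 is 4, the start of the run 1110; for a leaf such as index 6 the result is 12, a 0 bit, so the first child is null.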
![Page 14: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/14.jpg)
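Implementing rank1(index)
rank1(index) = block[index / 4] + table[pattern]
● block: rank1 at the first element of each fixed-size block (4 bits here)
● table: number of 1s in each 4-bit pattern (precomputed); looked up with the block's bits after its first element, up to index, it gives the value relative to the first element
index 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
bit 1 0 1 0 1 1 1 0 1 0 1 0 0 0 0
rank 1 1 2 2 3 4 5 5 6 6 7 7 7 7 7
block 1 3 6 7 (rank at indices 0, 4, 8, 12)
pattern 0 1 2 3 4 5 ... 15
table 0 1 1 2 1 2 ... 4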
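A minimal sketch of the block-plus-table idea, assuming 4-bit blocks as on the slide. Here `block[k]` stores the rank at the first index of block k (1, 3, 6, 7 for the example) and `table` holds the number of 1s in every 4-bit pattern; the struct and function names are hypothetical. Keeping the bits in a `std::string` is a simplification: a packed implementation would extract the pattern with a shift and a mask instead of a loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct RankIndex {
  std::string bits;                // '0'/'1' characters, for readability
  std::vector<std::size_t> block;  // block[k] = rank1 at index 4 * k
  std::array<int, 16> table{};     // table[p] = number of 1s in pattern p
};

RankIndex BuildRankIndex(const std::string& bits) {
  RankIndex r;
  r.bits = bits;
  // Popcount table: drop the lowest bit and add it back.
  for (int p = 1; p < 16; ++p) r.table[p] = r.table[p >> 1] + (p & 1);
  std::size_t ones = 0;
  for (std::size_t i = 0; i < bits.size(); ++i) {
    ones += (bits[i] == '1');
    if (i % 4 == 0) r.block.push_back(ones);  // first element of a block
  }
  return r;
}

// Number of 1s in bits[0..index], inclusive.
std::size_t Rank1(const RankIndex& r, std::size_t index) {
  assert(index < r.bits.size());
  const std::size_t start = index - index % 4;
  // Pack the bits after the block's first element, up to index, into a
  // pattern of at most 3 bits, then count its 1s with one table lookup.
  unsigned pattern = 0;
  for (std::size_t i = start + 1; i <= index; ++i) {
    pattern = (pattern << 1) | (r.bits[i] == '1' ? 1u : 0u);
  }
  return r.block[index / 4] + r.table[pattern];
}
```

For index 6 this reads `block[1] = 3` and looks up the pattern 11 (bits 5 and 6), giving 3 + 2 = 5, which agrees with the rank row above.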
![Page 15: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/15.jpg)
Implementing select0(n)
Apply binary search to rank0
● select0(n): index of the n-th 0 (n counted from 1) = inverse of rank0
● rank0(index) = index - rank1(index) + 1
● Speed up with a block-aware binary search
index 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
bit 1 0 1 0 1 1 1 0 1 0 1 0 0 0 0
rank1 1 1 2 2 3 4 5 5 6 6 7 7 7 7 7
rank0 0 1 1 2 2 2 2 3 3 4 4 5 6 7 8
select0 1 3 7 9 11 12 13 14 (n = 1 to 8)
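The binary search can be sketched as follows: since rank0 never decreases, the n-th 0 sits at the smallest index whose rank0 reaches n. For brevity this sketch precomputes the whole rank0 row, which a real succinct structure would not store (it would evaluate rank0 on demand from the rank1 index); the function names are hypothetical and the block-aware speed-up is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// rank0[i] = number of 0s in bits[0..i], inclusive.
std::vector<std::size_t> BuildRank0(const std::string& bits) {
  std::vector<std::size_t> rank0(bits.size());
  std::size_t zeros = 0;
  for (std::size_t i = 0; i < bits.size(); ++i) {
    zeros += (bits[i] == '0');
    rank0[i] = zeros;
  }
  return rank0;
}

// Index of the n-th 0 (n counted from 1): the smallest index i with
// rank0[i] >= n, found by binary search over the non-decreasing rank0.
std::size_t Select0(const std::vector<std::size_t>& rank0, std::size_t n) {
  assert(!rank0.empty() && n >= 1 && n <= rank0.back());
  std::size_t lo = 0;
  std::size_t hi = rank0.size() - 1;
  while (lo < hi) {
    const std::size_t mid = lo + (hi - lo) / 2;
    if (rank0[mid] < n) {
      lo = mid + 1;  // the n-th 0 lies strictly to the right of mid
    } else {
      hi = mid;      // mid already reaches n; keep it as a candidate
    }
  }
  return lo;
}
```

Each query takes O(log n) probes of rank0, which is why a fast rank implementation matters for select as well.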
![Page 16: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/16.jpg)
Performance comparison

| Data structure | Size | Time |
|---|---|---|
| text | 917KB | N/A |
| Pointer-based | 2.9MB | 10s |
| LOUDS (char) | 390KB | 15s |
| LOUDS (marisa) | 255KB | 6s |

Data source: /usr/share/dict/words in Ubuntu Linux
Data size: 99,171 words
Query: shuffled words * 100
![Page 17: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/17.jpg)
Conclusion
● Pointers are space consuming!
● LOUDS is a succinct tree representation
● Rank and select are the key components
● LOUDS is about 1/7 the size of the pointer-based tree and takes about 150% of its time
● How can marisa be so fast and small?
![Page 18: LOUDS: Succinct Data Structure](https://reader037.fdocuments.in/reader037/viewer/2022100601/55794dddd8b42a31678b5276/html5/thumbnails/18.jpg)
References
● Space efficient static trees and graphs, Jacobson, FOCS 1989.
● Practical implementation of rank and select queries, Gonzalez, WEA 2005.
● Efficient dictionary and language model compression for input method editors, Kudo, WTIM 2011.