ADAPTING TREE STRUCTURES FOR PROCESSING WITH SIMD INSTRUCTIONS

Authors: Steffen Zeuch, Frank Huber, Johann-Christoph Freytag
Humboldt-Universität zu Berlin
{zeuchste,huber,freytag}@informatik.hu-berlin.de

Page 2: Motivation

B+-Tree: common index structure

Common node-internal search algorithm:

Binary search in O(log₂ n)

Can we do better?

Yes, with SIMD!

Page 3: Outline

1. Background

2. Binary Search and SIMD

3. Segmented Tree

4. Segmented Trie

5. Evaluation

6. Conclusion

Page 4: SIMD

Single Instruction Multiple Data:

Available on CPU and GPU

Instruction classes: arithmetic, comparison, conversion, logical

[Figure: three example SIMD operations. Add a constant to a vector, e.g. (3, 2) + 2 = (5, 4); add two vectors; compare two vectors, which yields 0 or -1 per element.]
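The three operations in the figure can be imitated in plain Python, with a list standing in for a SIMD register. This is only a scalar emulation for illustration: the function names are hypothetical, and the operand values other than (3, 2) + 2 are chosen here rather than taken from the slide.

```python
# Scalar emulation of SIMD operations: each list plays the role of one register,
# and each list element plays the role of one lane.

def add_const(vec, const):
    """Add the same constant to every lane."""
    return [lane + const for lane in vec]

def add_vectors(a, b):
    """Add two registers lane by lane."""
    return [x + y for x, y in zip(a, b)]

def compare_ge(a, b):
    """Lane-wise a >= b: -1 (all bits set) where true, 0 where false."""
    return [-1 if x >= y else 0 for x, y in zip(a, b)]

print(add_const([3, 2], 2))           # [5, 4]
print(add_vectors([1, 2], [10, 20]))  # [11, 22]
print(compare_ge([8, 17], [9, 9]))    # [0, -1]
```

A real SIMD unit performs each of these calls as a single instruction over all lanes at once, which is the property the rest of the talk builds on.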

Page 5: Binary Search

[Figure: binary search for search key = 9, shown over iterations 1 to 5. Legend: separator, search key, excluded, search space.]
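A minimal sketch of the node-internal binary search, with an iteration counter added for illustration. The key set 0..25 is an assumption made here (the transcript does not preserve the keys shown in the figure); with it, the search for key 9 takes the five iterations the slide depicts.

```python
def binary_search(keys, key):
    """Classic binary search on a sorted list.

    Returns (index or None, number of iterations). One separator is
    compared per iteration.
    """
    lo, hi = 0, len(keys) - 1
    iterations = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2          # the single separator of this iteration
        if keys[mid] == key:
            return mid, iterations
        if keys[mid] < key:
            lo = mid + 1              # exclude the left half
        else:
            hi = mid - 1              # exclude the right half
    return None, iterations

keys = list(range(26))                # assumed example data: 0, 1, ..., 25
print(binary_search(keys, 9))         # (9, 5)
```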

Page 7: Binary Search with Two Separators

[Figure: search for search key = 9 using two separators per iteration, shown over iterations 1 to 3. Legend: separator, search key, excluded, search space.]
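The idea of using two separators per iteration can be sketched as a three-way split of the search space. The function name, the way the separator positions are chosen, and the key set 0..25 are assumptions for illustration; with them, key 9 is found in three iterations, as in the figure.

```python
def two_separator_search(keys, key):
    """Search a sorted list with two separators per iteration.

    Each iteration splits the remaining range [lo, hi) into three parts
    and keeps only the part that can contain the key.
    Returns (index or None, number of iterations).
    """
    lo, hi = 0, len(keys)
    iterations = 0
    while lo < hi:
        iterations += 1
        third = (hi - lo) // 3
        s1 = lo + third               # first separator position
        s2 = hi - 1 - third           # second separator position
        if keys[s1] == key:
            return s1, iterations
        if keys[s2] == key:
            return s2, iterations
        if key < keys[s1]:
            hi = s1                   # keep the left part
        elif key > keys[s2]:
            lo = s2 + 1               # keep the right part
        else:
            lo, hi = s1 + 1, s2       # keep the middle part
    return None, iterations

print(two_separator_search(list(range(26)), 9))   # (9, 3)
```

Done sequentially, the two comparisons per iteration cost more than one; the point of the following slides is that SIMD can evaluate both in a single instruction.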

Page 8: Binary Search + SIMD

[Figure: SIMD register A holds the separators (8, 17), SIMD register B holds the search key in every element (9, 9). The comparison A >= B writes (0, -1) to SIMD register C. Legend: separator, search key, excluded, search space.]
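The register contents in the figure can be reproduced with a scalar emulation. The helper names are hypothetical, and reading the number of 0 lanes as the index of the next partition is an interpretation of the figure, not something the slide states.

```python
def broadcast(value, lanes):
    """Fill every lane of a register with the same value."""
    return [value] * lanes

def compare_ge(a, b):
    """Lane-wise a >= b: -1 where true, 0 where false."""
    return [-1 if x >= y else 0 for x, y in zip(a, b)]

separators = [8, 17]
search_key = 9

register_a = separators                             # (8, 17)
register_b = broadcast(search_key, len(separators)) # (9, 9)
register_c = compare_ge(register_a, register_b)     # (0, -1)

# The separators are sorted, so the 0 lanes come first. Their count says how
# many separators are smaller than the key, i.e. which partition to search next.
partition = register_c.count(0)
print(register_c, partition)                        # [0, -1] 1
```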

Page 9: Problem: SIMD on CPU

SIMD instructions on the CPU do not support scatter and gather functionality.

[Figure: a SIMD load from a start position fills a 4 x 32-bit SIMD register with the consecutive values 8, 9, 10, 11. Separators that are not adjacent in memory cannot be loaded with one instruction.]

Page 10: Solution: K-ary Search by Schlegel et al.

[Figure: the keys arranged as a 3-ary search tree (k = 3) and stored in linearized order; search key = 9. Legend: separator, search key, excluded, search space.]
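One way to produce a linearized order is to write the k-ary search tree level by level, so that the k-1 separators of each node sit next to each other and can be fetched with one contiguous load. The sketch below assumes a breadth-first layout and a key count of exactly k^h - 1; the function name and the key set 0..25 are illustrative choices, not taken from the slide.

```python
def linearize(sorted_keys, k):
    """Rearrange sorted keys into breadth-first k-ary search tree order.

    Assumes len(sorted_keys) == k**h - 1 for some h >= 0, so the tree is full.
    """
    n = len(sorted_keys)
    full = 1
    while full < n + 1:
        full *= k
    if full != n + 1:
        raise ValueError("key count must be k**h - 1")

    out = []
    level = [(0, n)] if n else []     # half-open index ranges, one per node
    while level:
        next_level = []
        for lo, hi in level:
            child = (hi - lo - (k - 1)) // k   # keys in each child subtree
            pos = lo
            for i in range(k):
                if child > 0:
                    next_level.append((pos, pos + child))
                pos += child
                if i < k - 1:
                    out.append(sorted_keys[pos])  # separator between children
                    pos += 1
        level = next_level
    return out

print(linearize(list(range(8)), 3))        # [2, 5, 0, 1, 3, 4, 6, 7]
print(linearize(list(range(26)), 3)[:8])   # [8, 17, 2, 5, 11, 14, 20, 23]
```

With the keys 0..25 the root node holds the separators 8 and 17, which are the values shown in register A on the earlier SIMD comparison slide.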

Page 11: Applied K-ary Search

[Figure: k-ary search for search key = 9 on the 3-ary search tree and on its linearized order, shown over iterations 1 to 3. Legend: separator, search key, excluded, search space.]

Note (Steffen Zeuch): offset calculation

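The search over a linearized node, including the offset calculation mentioned in the note, might look as follows. This is a sketch under assumptions: a breadth-first layout of the keys 0..25 for k = 3 (written out as a literal), a hypothetical function name, and a list comprehension standing in for the single SIMD comparison.

```python
# Assumed breadth-first linearization of the sorted keys 0..25 for k = 3.
LINEARIZED = [8, 17, 2, 5, 11, 14, 20, 23,
              0, 1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25]

def kary_search(lin, k, key):
    """k-ary search on a breadth-first linearized full k-ary search tree.

    Returns (position in lin or None, number of iterations).
    """
    num_nodes = len(lin) // (k - 1)
    node = 0
    iterations = 0
    while node < num_nodes:
        iterations += 1
        base = node * (k - 1)                     # offset of this node's separators
        seps = lin[base:base + k - 1]             # one contiguous "SIMD load"
        mask = [-1 if s >= key else 0 for s in seps]   # emulated SIMD compare
        smaller = mask.count(0)                   # separators smaller than key
        if smaller < k - 1 and seps[smaller] == key:
            return base + smaller, iterations
        node = node * k + 1 + smaller             # offset of the child node
    return None, iterations

print(kary_search(LINEARIZED, 3, 9))              # (14, 3)
```

The search for key 9 visits the nodes (8, 17), (11, 14) and (9, 10), so it finishes in three iterations.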
Page 12: Degree of Parallelism

SIMD Bandwidth   Search Method   Data Type   Parallel Comparisons
128-bit          K-ary Search    8-bit       16
                                 16-bit      8
                                 32-bit      4
                                 64-bit      2
                 Binary Search   All         1
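The table entries follow from dividing the register width by the key width. The short sketch below recomputes them; the function name is hypothetical, and the relation k = comparisons + 1 rests on the assumption that one node of k-1 separators fills exactly one register.

```python
def parallel_comparisons(simd_bits, key_bits):
    """Number of keys that fit into one SIMD register."""
    return simd_bits // key_bits

for key_bits in (8, 16, 32, 64):
    lanes = parallel_comparisons(128, key_bits)
    # Assumption: a k-ary search node holds k-1 separators, one per lane.
    print(f"{key_bits}-bit keys: {lanes} parallel comparisons, k = {lanes + 1}")
```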

Page 14: Segmented Tree

Change the inner-node search algorithm from the commonly used binary search to k-ary search.

Page 15: Problem: Unfilled Nodes

K-ary requirement: the number of keys must be a multiple of k-1

[Figure: 3-ary search tree and its linearized order for an unfilled node; entries labeled Smax+1.]
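A sketch of how an unfilled node could be brought up to the required key count. It assumes that the Smax+1 label in the figure denotes a filler value one greater than the largest key in the node; the transcript does not spell this out, and the function name is hypothetical.

```python
def pad_keys(sorted_keys, k):
    """Pad a sorted key list so that its length is a multiple of k-1.

    Assumption: unused slots are filled with Smax + 1, where Smax is the
    largest key currently stored, so the list stays sorted.
    """
    lanes = k - 1
    remainder = len(sorted_keys) % lanes
    if remainder == 0:
        return list(sorted_keys)
    filler = sorted_keys[-1] + 1      # Smax + 1
    return list(sorted_keys) + [filler] * (lanes - remainder)

print(pad_keys([1, 4, 7], 3))         # [1, 4, 7, 8]
print(pad_keys([1, 2, 3, 4, 5], 5))   # [1, 2, 3, 4, 5, 6, 6, 6]
```

A filler larger than every real key keeps the node sorted, so a search for any stored key is never steered toward the padded slots.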

Page 16: Reordering

New keys require reordering:

Sorting → Inserting → Linearizing

Exceptions:

Empty node

Key is greater than the largest existing key

Page 17: Segmented Tree

Advantages:

High resource utilization

Fewer iterations required

Binary search: log₂ n vs. k-ary search: log_k n

Disadvantages:

Reordering overhead

Large data types decrease performance
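The difference between log₂ n and log_k n can be made concrete by counting iterations for a given node size. The sketch uses integer arithmetic instead of floating-point logarithms; the function name and the node size of 26 keys are illustrative assumptions.

```python
def worst_case_iterations(n, k):
    """Smallest number of steps h with k**h >= n + 1.

    Each step of a k-ary search divides the search space by k, so this is
    the worst-case iteration count for n keys.
    """
    steps, reach = 0, 1
    while reach < n + 1:
        reach *= k
        steps += 1
    return steps

n = 26                                     # assumed node size
print(worst_case_iterations(n, 2))         # 5  (binary search)
print(worst_case_iterations(n, 3))         # 3  (3-ary search, 64-bit keys)
print(worst_case_iterations(n, 17))        # 2  (17-ary search, 8-bit keys)
```

The last two lines also show why large data types are listed as a disadvantage: fewer keys fit into one register, k shrinks, and more iterations are needed.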

Page 19: Segmented Trie

[Figure: example keys shown as Key (Dec), Key (Hex) and Partial Key (Hex), with the partial keys assigned to Level 1 and Level 2.]
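Splitting a key into partial keys, one per trie level, can be sketched as follows. The 16-bit key width, the 8-bit segment width and the example key are assumptions chosen to match the two levels in the figure; the actual keys on the slide are not preserved in the transcript.

```python
def partial_keys(key, key_bits=16, segment_bits=8):
    """Split a key into fixed-width partial keys, most significant first.

    Each partial key is searched on its own trie level.
    """
    levels = key_bits // segment_bits
    mask = (1 << segment_bits) - 1
    return [(key >> (segment_bits * (levels - 1 - level))) & mask
            for level in range(levels)]

key = 4660                                  # hypothetical key, 0x1234 in hex
print([hex(p) for p in partial_keys(key)])  # ['0x12', '0x34']
```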

Page 20: Segmented Trie

[Figure: segmented trie; no text survives in the transcript.]

Page 21: Segmented Trie

Advantages:

High SIMD search performance

Prefix compression

Early termination

Disadvantages:

Fixed level count

Reordering overhead

Note (Steffen Zeuch): 64-bit data type: 2 parallel comparisons; 8-bit data type: 16 parallel comparisons (17-ary search)

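The interplay of fixed level count, prefix compression and early termination can be illustrated with a small trie over 8-bit partial keys. This is a sketch only: nested dictionaries stand in for the trie nodes (a SegTrie would search each node with k-ary SIMD search), all names are hypothetical, and early termination is assumed here to mean that a lookup stops at the first level where the partial key is missing.

```python
def partial_keys(key, key_bits=16, segment_bits=8):
    """Split a key into fixed-width partial keys, most significant first."""
    levels = key_bits // segment_bits
    mask = (1 << segment_bits) - 1
    return [(key >> (segment_bits * (levels - 1 - i))) & mask
            for i in range(levels)]

def insert(root, key, value):
    """Insert a key; keys with a common prefix share the upper-level nodes."""
    parts = partial_keys(key)
    node = root
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value

def lookup(root, key):
    """Return (value or None, levels visited)."""
    parts = partial_keys(key)
    node = root
    visited = 0
    for part in parts[:-1]:
        visited += 1
        node = node.get(part)         # stand-in for a k-ary search in the node
        if node is None:
            return None, visited      # assumed early termination
    return node.get(parts[-1]), visited + 1

root = {}
insert(root, 0x1234, "a")
insert(root, 0x1250, "b")             # shares the prefix 0x12 with the first key
print(lookup(root, 0x1234))           # ('a', 2)
print(lookup(root, 0x9900))           # (None, 1)
```

Both stored keys share the level-1 entry 0x12, which is the prefix compression effect; the number of levels is fixed by the key width and the segment width.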
Page 22: SegTree vs. SegTrie

                       SegTree                SegTrie
Derived from           B+-Tree                Prefix B-Tree
Number of iterations   Tree height            Max. number of levels (early termination)
Number of levels       Dynamic                Static (pre-defined)
DOP                    Depends on data type   16 (8-bit)

Page 24: Test Setup

HW/SW Configuration:

CPU: Intel Xeon 5520, 4 x 2.26 GHz

L1: 32 KB, L2: 256 KB, L3: 8 MB, MM: 8 GB

Cacheline: 128 Byte, SIMD bandwidth: 128 Bit

Windows 7 64-bit Professional

Test Dataset:

Synthetically generated, ascending, starting at 0

Page 25: Evaluation: Bitmask

Three Algorithms:

1. Bit Shifting

2. Case-Switch

3. PopCnt

[Figure: SIMD register A holds the separators (8, 17), SIMD register B holds the search key (9, 9). The comparison A >= B writes (0, -1) to SIMD register C.]
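The three algorithms turn the comparison result into a position inside the node. The sketch below shows one possible reading of each, for a register with four lanes. The details are assumptions: the slide gives only the names, the bitmask is built with one bit per lane (similar in spirit to a movemask instruction), and all function names are hypothetical.

```python
LANES = 4

def to_bitmask(register_c):
    """One bit per lane: bit i is set if lane i holds -1."""
    return sum(1 << i for i, lane in enumerate(register_c) if lane == -1)

def position_by_shifting(mask, lanes=LANES):
    """Bit shifting: test the bits one by one until the first set bit."""
    pos = 0
    while pos < lanes and not (mask >> pos) & 1:
        pos += 1
    return pos

# Case-switch: one entry per bitmask that can occur for sorted separators.
CASES = {0b1111: 0, 0b1110: 1, 0b1100: 2, 0b1000: 3, 0b0000: 4}

def position_by_case(mask):
    """Case-switch: look the bitmask up directly."""
    return CASES[mask]

def position_by_popcount(mask, lanes=LANES):
    """PopCnt: lanes minus the number of set bits."""
    return lanes - bin(mask).count("1")

mask = to_bitmask([0, -1, -1, -1])    # separators (8, 17, 26, 35) >= key 9
print(position_by_shifting(mask),
      position_by_case(mask),
      position_by_popcount(mask))     # 1 1 1
```

All three variants return the number of separators smaller than the search key; they differ only in how much work is needed to extract it from the bitmask.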

Page 26: Evaluation: SegTree

[Figure: evaluation results for the SegTree; no data survives in the transcript.]

Page 27: Evaluation: SegTrie

[Figure: evaluation results for the SegTrie; no data survives in the transcript.]

Page 29: Our Contributions

B+-Tree and prefix B-Tree using SIMD

Transformation and search algorithm for breadth-first and depth-first data layout

Three algorithms for interpreting a SIMD comparison result

Solution for an arbitrary key count

Thanks

Page 30: Backup