Fast BVH Construction on GPUs (Eurographics 2009)

Page 1: Fast BVH Construction on GPUs (Eurographics 2009)

Fast BVH Construction on GPUs (Eurographics 2009)

Park, Soonchan

KAIST (Korea Advanced Institute of Science and Technology)

Contents

● What is BVH
● Motivation
● Three Algorithms to Construct BVH
  ● LBVH
  ● SAH Hierarchy Construction
  ● Hybrid GPU Construction Algorithm
● Results & Analysis

What is BVH?

● Bounding Volume Hierarchy
  ● A tree structure on a set of geometric objects
● “Fast Computation”
  ● Ray tracing
  ● Collision detection
  ● Visibility culling

What is BVH?

● Issues of BVH construction
  ● Construction time
  ● Effectiveness of construction
    ● How much improvement the BVH brings
      – Median Subdivision & Surface Area Heuristic

Motivation

● BVH Construction

Almost all prior work covers purely serial construction algorithms.

Goal: efficient parallel algorithms on manycore processors!

How can the steps of BVH construction be made suitable for parallel computation?

LBVH

● Linear Bounding Volume Hierarchy
● Simplest approach to parallelizing BVH construction
● Sort the input primitives by their Morton codes
● BVH construction reduces to sorting, O(n log n)

Morton Codes (Z-order)

● Space-filling curve
● Morton Codes (Z-order)
  ● Good locality preservation
  ● Express a position in space as bits
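
The bit interleaving behind a Morton code can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' GPU code: the names `quantize`, `morton3d` and `sort_by_morton`, the default of 10 bits per axis, and the assumption that all centroids lie in the unit cube are choices made for this example. Python's built-in `sorted` stands in for the parallel radix sort.

```python
def quantize(p, bits):
    """Map a coordinate in [0, 1) to an integer grid cell using `bits` bits."""
    cells = 1 << bits
    return min(int(p * cells), cells - 1)


def morton3d(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y and z (x bit first in each triple)."""
    code = 0
    for i in range(bits - 1, -1, -1):
        code = (code << 3) | (((x >> i) & 1) << 2) | (((y >> i) & 1) << 1) | ((z >> i) & 1)
    return code


def sort_by_morton(centroids, bits=10):
    """Return primitive indices ordered along the Z-order curve."""
    codes = [morton3d(*(quantize(c, bits) for c in p), bits=bits) for p in centroids]
    return sorted(range(len(centroids)), key=lambda i: codes[i])


# Hypothetical centroids inside the unit cube.
centroids = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.6, 0.1, 0.1)]
order = sort_by_morton(centroids, bits=2)
print(order)  # [1, 2, 0]
```

Primitives that are close in space tend to receive codes with a long common prefix, which is the property the hierarchy construction relies on.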

[Figure: Morton Codes (Z-order)]

LBVH

● Linear B.V.H.
● Sort the primitives along the curve
  ● Parallel radix sort [SHG08]
● Each primitive has a bit expression of its position
● How to make the hierarchy?

LBVH

● Make the hierarchy
  ● Compare each primitive i with primitive i+1
    ● Find the level at which they are separated
    ● Make a list of (primitive index, separation level) pairs
  ● Re-sort the list by level

This gives the intervals at each level!
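
The steps above can be sketched as follows. This is a sequential illustration under stated assumptions, not the paper's kernel: the helper names are hypothetical, each level is assumed to consume 2 bits of the code, and the eight codes are made up so that the output matches the split list in the example that follows. On a GPU, each adjacent pair would be compared by its own thread and the list sorted with a parallel sort.

```python
def separation_level(a, b, levels, bits_per_level):
    """Return the 1-based level at which codes a and b first differ, or None."""
    for level in range(1, levels + 1):
        shift = (levels - level) * bits_per_level
        if (a >> shift) != (b >> shift):
            return level
    return None  # identical codes: both primitives end up in the same leaf


def split_list(sorted_codes, levels, bits_per_level):
    """Build (primitive index, level) pairs and sort them by level.

    A split first seen at level L is repeated for all deeper levels, so the
    entries of one level delimit every interval (node) on that level.
    """
    pairs = []
    for i in range(len(sorted_codes) - 1):
        first = separation_level(sorted_codes[i], sorted_codes[i + 1], levels, bits_per_level)
        if first is not None:
            pairs.extend((i + 1, lvl) for lvl in range(first, levels + 1))
    return sorted(pairs, key=lambda p: (p[1], p[0]))


def intervals_at_level(pairs, level, n):
    """Turn the split entries of one level into (first, last) primitive ranges."""
    cuts = [idx for idx, lvl in pairs if lvl == level]
    return list(zip([1] + [c + 1 for c in cuts], cuts + [n]))


# Hypothetical sorted codes for 8 primitives, written as four base-4 digits.
codes = [int(d, 4) for d in ["0000", "0001", "0010", "0100", "0110", "0120", "1000", "1010"]]
pairs = split_list(codes, levels=4, bits_per_level=2)
print(pairs[:3])                        # [(6, 1), (3, 2), (6, 2)]
print(intervals_at_level(pairs, 1, 8))  # [(1, 6), (7, 8)]
print(intervals_at_level(pairs, 2, 8))  # [(1, 3), (4, 6), (7, 8)]
```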

Example

(6, 1)

(3, 2) (6, 2)

(2,3) (3,3) (4,3) (5,3) (6,3) (7,3)

(1,4) (2,4) (3,4) (4,4) (5,4) (6,4) (7,4)

Split list: (primitive index, separation level)

[Figure: hierarchy built level by level from the split list for primitives 1 to 8. Level 1: (1 2 3 4 5 6) (7 8). Level 2: (1 2 3) (4 5 6). Level 3: (1 2) (3) (4) (5) (6) (7) (8). Level 4: (1) (2).]

LBVH

● Pros
  ● Very fast: same complexity as sorting
    ● + we use parallel radix sort [SHG08]
● Cons
  ● The constructed hierarchy is not optimized
    ● It uniformly subdivides space at the median
  ● A leaf can hold multiple primitives

What is SAH

● Surface Area Heuristic
● Answers how to build an optimized hierarchy
  ● “Which of a number of partitions of primitives will be better?”
  ● “Which of a number of possible positions to split space will be better?”

What is SAH

● SAH-optimized construction can also be achieved in O(n log n) [WH06]
● Process for SAH
  ● Recursively split the set of geometric primitives (usually into two parts per step, giving a binary tree)
  ● Evaluate each candidate split with a “cost function”
    ● The cost function can be defined
  ● Find the candidate with the lowest cost
    ● Checking all possible split positions can be costly
    ● A sampling method can be applied
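
The slides leave the cost function open. One widely used form from the ray tracing literature, which may differ from the one the presenters had in mind, charges a traversal cost plus the intersection cost of each child weighted by its surface area relative to the parent: C = C_trav + C_isect * (SA(L)/SA(P) * N_L + SA(R)/SA(P) * N_R). The sketch below assumes axis-aligned boxes given as (min, max) corner tuples; the names and the constants `c_trav` and `c_isect` are placeholders for this example.

```python
def surface_area(box):
    """Surface area of an axis-aligned box given as (min_corner, max_corner)."""
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(parent, left, n_left, right, n_right, c_trav=1.0, c_isect=1.0):
    """SAH cost of splitting `parent` into two children holding n_left / n_right primitives."""
    sa_parent = surface_area(parent)
    p_left = surface_area(left) / sa_parent    # chance a ray hitting the parent hits the left child
    p_right = surface_area(right) / sa_parent  # same for the right child
    return c_trav + c_isect * (p_left * n_left + p_right * n_right)


# Hypothetical node: a 2 x 1 x 1 box cut in half along x, with 3 and 1 primitives.
parent = ((0, 0, 0), (2, 1, 1))
left = ((0, 0, 0), (1, 1, 1))
right = ((1, 0, 0), (2, 1, 1))
cost = sah_cost(parent, left, 3, right, 1)
print(round(cost, 3))  # 3.4
```

A lower cost means rays entering the node are expected to do less work, so the builder keeps the candidate with the smallest value.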

GPU SAH Construction

● Breadth-first construction using work queues

● Parallelization!

[Figure: work queues, labeled “Input queue” and “Output queue”]
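
The breadth-first scheme with an input and an output queue can be sketched sequentially as below. It is only an illustration of the queue handling: the function name is hypothetical, primitives are represented by index ranges, and a plain median split stands in for the SAH split. On the GPU, all entries of the input queue would be processed in parallel, one level per pass.

```python
def build_breadth_first(n, leaf_size=1):
    """Build a hierarchy over primitives [0, n) level by level.

    Returns the node list in breadth-first order and the number of levels.
    """
    nodes = []
    levels = 0
    input_queue = [(0, n)]  # active splits of the current level
    while input_queue:
        output_queue = []   # splits produced for the next level
        for begin, end in input_queue:  # conceptually one parallel task per entry
            if end - begin <= leaf_size:
                nodes.append(("leaf", begin, end))
            else:
                mid = (begin + end) // 2  # placeholder for the SAH split position
                nodes.append(("inner", begin, end))
                output_queue.append((begin, mid))
                output_queue.append((mid, end))
        input_queue = output_queue  # the output queue becomes the next input queue
        levels += 1
    return nodes, levels


nodes, levels = build_breadth_first(4)
print(len(nodes), levels)  # 7 3
```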

Data-Parallel SAH Split

● Two steps for performing an SAH split
  ● Determine the best split position by evaluating the SAH
  ● Reorder the primitives (according to the new split)

Data-Parallel SAH Split

● Determine the best split position
  ● Approximate SAH computation
  ● Generate k uniformly sampled split candidates for each of the three axes (test all samples in parallel using 3k threads)
  ● Each thread computes the SAH cost for its split candidate
  ● Find the split candidate with the lowest cost
● Reorder the primitives
  ● According to the new split
  ● Only reorder the indices
    ● No copy of geometry
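
A sequential sketch of this sampled split is given below. The assumptions are mine: boxes are (min, max) corner tuples, primitives are assigned to a side by their centroid, and the cost omits the constant terms and the parent area because they are equal for every candidate of the same node. The two nested loops visit the 3k candidates that the slide assigns to 3k threads, and only the index list is reordered.

```python
def union(boxes):
    """Smallest axis-aligned box enclosing all given (min, max) boxes."""
    lo = tuple(min(b[0][a] for b in boxes) for a in range(3))
    hi = tuple(max(b[1][a] for b in boxes) for a in range(3))
    return lo, hi


def area(box):
    dx, dy, dz = (box[1][a] - box[0][a] for a in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def best_sampled_split(boxes, indices, k=8):
    """Return (cost, axis, position, reordered indices, left count) or None."""
    centroid = {i: [(boxes[i][0][a] + boxes[i][1][a]) / 2 for a in range(3)] for i in indices}
    node = union([boxes[i] for i in indices])
    best = None
    for axis in range(3):
        lo, hi = node[0][axis], node[1][axis]
        for s in range(1, k + 1):  # k uniformly spaced candidates per axis
            pos = lo + (hi - lo) * s / (k + 1)
            left = [i for i in indices if centroid[i][axis] < pos]
            right = [i for i in indices if centroid[i][axis] >= pos]
            if not left or not right:
                continue  # candidate does not separate anything
            cost = (area(union([boxes[i] for i in left])) * len(left)
                    + area(union([boxes[i] for i in right])) * len(right))
            if best is None or cost < best[0]:
                best = (cost, axis, pos, left + right, len(left))
    return best


# Hypothetical scene: two unit boxes near x = 0 and two near x = 10.
boxes = [((0, 0, 0), (1, 1, 1)), ((10, 0, 0), (11, 1, 1)),
         ((1, 0, 0), (2, 1, 1)), ((11, 0, 0), (12, 1, 1))]
cost, axis, pos, order, n_left = best_sampled_split(boxes, [0, 1, 2, 3])
print(axis, order, n_left)  # 0 [0, 2, 1, 3] 2
```

The geometry in `boxes` is never moved; the children are described by the first `n_left` entries of `order` and the rest.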

Small Split Operation

● Two main bottlenecks
  ● Initial splits at the top levels of the hierarchy are very slow
    ● Large # of primitives at the top level
      – Addressed by the hybrid method (discussed later)
  ● Large # of small splits at the low levels
    ● Problems
      ● Higher compaction costs caused by the large # of splits
      ● Vector utilization is low (few primitives per split)
● The large # of small splits is a problem: use a different split kernel for small splits

Small Split Operation

● Main idea
  ● Set a threshold to define a “small split”
    ● Depends on geometry data & cache size (32)
  ● Use the processor’s local memory
    ● To maintain a local work queue
    ● To keep all the geometric primitives
● Pros
  ● Reduces memory bandwidth
  ● Decreases # of threads
    ● Maximizes utilization of vector operations
  ● Avoids waiting for memory access

15~20% speed-up

Small Split Operation

[Figure: time and # of active splits, plotted against the level of splits]

Hybrid GPU Construction Algorithm

● LBVH
  ● The resulting hierarchy is not optimized
    ● Shallow hierarchy
    ● Large # of primitives at the leaves
  ● But FAST
● Problem of GPU SAH construction
  ● Relatively slow
    ● Overhead at the first levels
  ● But it can build an optimized hierarchy
● Solution
  ● Top levels use LBVH
  ● The other levels use GPU SAH construction
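
One way to picture the hand-over between the two methods is sketched below, under assumptions of my own: the primitives are already sorted by Morton code, the top of the tree is defined by the leading `top_bits` bits of each code, and every resulting range would then be handed to the SAH builder as an initial work item. The function name and the parameter values are hypothetical.

```python
from itertools import groupby


def top_level_groups(sorted_codes, total_bits, top_bits):
    """Group sorted Morton codes by their leading `top_bits` bits.

    Returns (prefix, begin, end) ranges; each range is a candidate work item
    for SAH construction of the lower levels.
    """
    shift = total_bits - top_bits
    groups = []
    begin = 0
    for prefix, run in groupby(sorted_codes, key=lambda c: c >> shift):
        count = len(list(run))
        groups.append((prefix, begin, begin + count))
        begin += count
    return groups


# Hypothetical 4-bit codes; the top 2 bits define the LBVH part of the tree.
codes = [0b0001, 0b0010, 0b0111, 0b1100, 0b1101]
groups = top_level_groups(codes, total_bits=4, top_bits=2)
print(groups)  # [(0, 0, 2), (1, 2, 3), (3, 3, 5)]
```

How many top levels to build with LBVH is a tuning choice that trades construction speed against hierarchy quality.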

Results

● Render several scenes
● Comparison with other environments
  ● Single-core, non-optimized CPU SAH
    ● Full SAH
  ● Standard CPU BVH ray tracer using ray packets
● Compared on
  ● Construction time, quality of the optimization, fps

[Results slides (3): construction time; absolute/relative ray tracing performance]

Results

● GPU SAH
  ● Shows better performance than CPU SAH
  ● Good optimization
● LBVH
  ● Fast, not optimized
  ● Scene dependent
● Hybrid
  ● Between GPU SAH & LBVH
  ● Can be customized

Analysis

● Current GPU architectures offer several features for constructing hierarchies
  ● Special graphics memory
    ● Significantly higher memory bandwidth
  ● Ability to manage fast local memory
    ● Discussed in Small Split Operation
● Memory
  ● 113 bytes/triangle
    ● Worst case: one triangle per leaf

This allows multi-million-triangle models on current GPUs
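
A quick back-of-the-envelope check of the multi-million claim, using the 113 bytes/triangle figure from the slide. The 1 GiB memory budget is a made-up value for illustration, and the calculation ignores everything else that has to live in GPU memory.

```python
BYTES_PER_TRIANGLE = 113  # worst-case figure quoted on the slide


def max_triangles(memory_bytes, bytes_per_triangle=BYTES_PER_TRIANGLE):
    """Number of triangles whose construction data fits in the given budget."""
    return memory_bytes // bytes_per_triangle


one_gib = 1 << 30  # hypothetical GPU memory budget
print(max_triangles(one_gib))  # 9502140
```

Under these assumptions roughly 9.5 million triangles fit, which is consistent with the statement on the slide.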

Analysis

● Bottleneck Analysis

[Figure: core overhead and memory overhead]

Analysis

● Time Distribution

[Figure: time distribution, Full SAH build vs. Hybrid build]

*Rest = reading/writing BVH node information, setting up splits, and all remaining steps combined

“Note that Hybrid build is 10 times faster”

Video

● Youtube Video

References

● [SHG08] SATISH N., HARRIS M., GARLAND M.: Designing efficient sorting algorithms for manycore GPUs. Under review (2008).

● [WH06] WALD I., HAVRAN V.: On building fast kd-trees for ray tracing, and on doing that in O(N log N). In Proc. of IEEE Symp. on Interactive Ray Tracing (2006), pp. 61–69.