CHC++: Coherent Hierarchical Culling Revisited
Oliver Mattausch, Jiří Bittner, Michael Wimmer
Institute of Computer Graphics and Algorithms
Vienna University of Technology
Oliver Mattausch 2
CHC++: Fast occlusion culling algorithm
Occlusion Culling
Render only visible geometry → Output sensitivity
Preprocessing vs. online occlusion culling
Preprocessing: Visibility from a region
Online Occlusion Culling
Query visibility from view point
+ No preprocessing
+ Dynamic scenes
Hardware occlusion queries → # visible pixels
Query bounding box → Visible? → Render geometry
Naive method: Hierarchical Stop & Wait
For each node: Issue query
Visible → traverse subtree
Invisible → cull subtree
Problem: Query latency
CPU stalls
GPU starvation
[Figure: cull and render phases in hierarchical culling]
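The stop-and-wait traversal described on this slide can be sketched on a toy hierarchy. This is a minimal illustration, not the authors' implementation: `Node`, its `visible` flag (a stand-in for the result a hardware occlusion query would return) and the `stalls` counter are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    visible: bool  # assumption: stands in for the result of a real GPU query
    children: List["Node"] = field(default_factory=list)


def stop_and_wait(root):
    """Query every node and block on each result before continuing."""
    rendered, stalls = [], 0
    stack = [root]
    while stack:
        node = stack.pop()
        stalls += 1  # the CPU would wait here for the query result
        if not node.visible:
            continue  # invisible -> cull the whole subtree
        if node.children:
            stack.extend(reversed(node.children))  # visible -> traverse subtree
        else:
            rendered.append(node.name)  # visible leaf -> render geometry
    return rendered, stalls


scene = Node("root", True, [
    Node("a", True, [Node("a1", True), Node("a2", False)]),
    Node("b", False, [Node("b1", False), Node("b2", False)]),
])
rendered, stalls = stop_and_wait(scene)
```

In this toy scene only one leaf is rendered, yet the traversal blocks five times, which is the latency problem the slide points at.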
Previous Work
Coherent Hierarchical Culling (CHC) [Bittner04]
Near Optimal Hierarchical Culling (NOHC) [Guthe06]
CHC
While waiting for query result → traverse / render
Keep query queue
Use coherence, assume node stays (in)visible
For previously visible nodes:
Don't wait for query result
Use result in next frame
[Diagram: Issue query → Render geometry → Result available?]
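The use of coherence can be illustrated with a heavily simplified, flat (non-hierarchical) frame loop. This is only a sketch under stated assumptions: `chc_frame`, `prev_visible` and `truth` are hypothetical names, and query latency is modeled simply by reading all results after all rendering commands were submitted.

```python
from collections import deque


def chc_frame(nodes, prev_visible, truth):
    """One frame of a simplified CHC-style loop.

    nodes: node names in front-to-back order
    prev_visible: nodes classified visible in the previous frame
    truth: nodes a real occlusion query would report visible (assumption)
    """
    query_queue = deque()  # issued queries with pending results
    rendered, now_visible = [], set()
    for n in nodes:
        query_queue.append(n)  # issue a query for every node
        if n in prev_visible:
            rendered.append(n)  # assume still visible: render, don't wait
    while query_queue:  # results are fetched later
        n = query_queue.popleft()
        if n in truth:
            now_visible.add(n)
            if n not in prev_visible:
                rendered.append(n)  # newly visible: render once result is known
    return rendered, now_visible


nodes = ["a", "b", "c"]
frame1, vis1 = chc_frame(nodes, {"a", "b"}, {"a", "c"})
frame2, vis2 = chc_frame(nodes, vis1, {"a", "c"})
```

Node `b` is rendered once more in the first frame although it is already hidden; the query result only takes effect in the following frame.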
Problems of CHC
Too many queries
Not GPU friendly
Many state changes
Bounding box query (8 vertices per draw call)
Can be slower (!) than view frustum culling (VFC)
Most houses visible → Bad view point for CHC
Properties of NOHC
+ Query only if cheaper than rendering
+ Mostly better than view frustum culling
+ Close to self-defined optimum
- Hardware calibration step
- Complex set of rules
Possible to beat the defined optimum
Can reduce cost of queries
Can further reduce # queries
Improved algorithm: CHC++
Reduction of state changes, queries, wait time, and rendered geometry
Keeps simplicity of CHC
Building Blocks of CHC++
Query batching: reduction of state changes, reduction of CPU stalls
Multiqueries: reduction of queries
Randomization: better distribution of queries
Tight bounding volumes: reduction of queries, reduction of rendered geometry
Query Batching: State Changes
Switch between render / query mode → need state change (depth write on / off)
CHC induces one state change per query
Big overhead on modern GPUs!
Idea: Store query candidates in separate queue
Collect n nodes
Switch to query mode
Query all nodes
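The effect of batching on state changes can be illustrated by counting switches into query mode in a stream of commands. This is a toy model, not a GPU measurement: the function names and the node and batch counts are illustrative assumptions.

```python
def count_query_mode_switches(ops):
    """Count how often the command stream switches into query mode."""
    switches, mode = 0, "render"  # assumption: the frame starts in render mode
    for op in ops:
        if op == "query" and mode != "query":
            switches += 1
        mode = op
    return switches


def interleaved(num_nodes):
    """CHC-like stream: every query is followed by rendering."""
    ops = []
    for _ in range(num_nodes):
        ops += ["query", "render"]
    return ops


def batched(num_nodes, batch_size):
    """Collect candidates, then issue up to batch_size queries per switch."""
    ops = []
    for start in range(0, num_nodes, batch_size):
        count = min(batch_size, num_nodes - start)
        ops += ["query"] * count + ["render"]
    return ops


chc_switches = count_query_mode_switches(interleaved(100))
batched_switches = count_query_mode_switches(batched(100, 50))
```

With 100 queries, the interleaved stream switches into query mode once per query, while batches of 50 need only two switches.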
Query Batching: Previously invisible nodes
[Diagram: candidate queue, query queue, state change, render mode; five queries issued as one batch]
Query Batching: Previously visible nodes
Previously visible nodes:
No dependencies (geometry rendered anyway)
Can issue query at any time
Handle them in separate queue
Issue queries to fill up wait time
Very likely no new state change
Issue rest of queries in next frame
Query Batching: Visualization
CHC: ~100 state changes
CHC++: 2 state changes (max. batch size: 50)
Each color represents a state change
Multiqueries: Idea
Node invisible for long time
Likely to stay invisible (e.g., car engine block)
Cover many nodes with single query
Test q invisible nodes by single multiquery
Invisible → saved (q – 1) queries
Visible → must test individually, wasted one query
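The trade-off on this slide can be written as an expected query count. The sketch below assumes a single probability `p_all_invisible` that all q nodes are still invisible; both function names are hypothetical.

```python
def expected_queries(q, p_all_invisible):
    """Expected number of queries when q nodes share one multiquery.

    The multiquery itself always costs one query. If it comes back visible
    (probability 1 - p_all_invisible), all q nodes are tested individually.
    """
    return 1 + (1 - p_all_invisible) * q


def expected_savings(q, p_all_invisible):
    """Average number of queries saved compared with q individual queries."""
    return q - expected_queries(q, p_all_invisible)


best_case = expected_savings(8, 1.0)     # all nodes stay invisible
worst_case = expected_savings(8, 0.0)    # the multiquery always fails
break_even = expected_savings(8, 0.125)  # p equals 1 / q
```

Under this model a multiquery pays off on average only when the probability that the whole group stays invisible exceeds 1 / q.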
Multiqueries: Minimize #queries
Use history of nodes
Estimate probability that node will still be invisible in frame n if it was invisible in frame n - 1
Measurements behave like a certain exp() function → sufficient in practice
[Figure: fitted and measured functions]
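An exponential estimate of this kind could look as follows. The functional form `a - b * exp(-t)` and the constants `a` and `b` are illustrative assumptions, not values given on the slides; `p_all_keep` additionally assumes that nodes behave independently.

```python
import math


def p_keep(frames_invisible, a=0.99, b=0.7):
    """Estimated probability that a node stays invisible in the next frame.

    a and b are placeholder constants chosen for illustration only.
    """
    return a - b * math.exp(-frames_invisible)


def p_all_keep(history):
    """Probability that every node of a group stays invisible.

    history: number of frames each node has already been invisible.
    """
    p = 1.0
    for frames_invisible in history:
        p *= p_keep(frames_invisible)
    return p


recent = p_keep(1)    # node became invisible only recently
stable = p_keep(10)   # node has been invisible for a long time
group = p_all_keep([10, 10, 10])
```

The longer a node has been invisible, the closer its estimate gets to the upper bound `a`, and the product over a group shrinks as more nodes are added.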
Multiqueries: Greedy Algorithm
While node batch not empty
Add node to multiquery
Use cost / benefit model
Query size optimal → issue multiquery
Visualization: Each color represents a multiquery
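A greedy grouping along these lines might be sketched as below. The cost model (expected queries per covered node, with a single node counted as one ordinary query) is an assumption made for this example and is not necessarily the model used by the authors; `queries_per_node` and `greedy_multiqueries` are hypothetical names.

```python
def queries_per_node(size, p_all_invisible):
    """Expected queries per covered node under the assumed cost model."""
    if size == 1:
        return 1.0  # a single node is just an ordinary query
    return (1.0 + (1.0 - p_all_invisible) * size) / size


def greedy_multiqueries(keep_probs):
    """Group nodes into multiqueries while the cost per node keeps dropping.

    keep_probs: per-node probability of staying invisible.
    """
    pending = sorted(keep_probs, reverse=True)  # most stable nodes first
    batches = []
    while pending:  # while node batch not empty
        batch = [pending.pop(0)]
        p_all = batch[0]
        while pending:
            p_next = p_all * pending[0]
            grown = queries_per_node(len(batch) + 1, p_next)
            if grown >= queries_per_node(len(batch), p_all):
                break  # size is optimal -> issue this multiquery
            batch.append(pending.pop(0))
            p_all = p_next
        batches.append(batch)
    return batches


sizes = [len(b) for b in greedy_multiqueries([0.99, 0.1, 0.99, 0.99])]
```

In this example the three stable nodes end up in one multiquery, while the node that is likely to become visible is queried on its own.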
Queues in CHC++
traversal queue
v-queue (visible nodes)
i-queue (invisible nodes)
query queue
Multiquery
Randomization: Assumed Visibility
Test previously visible nodes each frame → queries wasted
Assume visible for t frames
Frame rate drops every t frames
[Diagram: query (Q) timeline per node]
Randomization: Idea
When node becomes visible
Randomize first invocation within [1..t]
Afterwards, test every t frames
[Diagram: query (Q) timeline per node]
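The effect of randomizing the first invocation can be simulated by counting the queries issued per frame. This is a toy simulation under the assumption that all nodes become visible in the same frame; `queries_per_frame` and the chosen node count, `t` and seed are illustrative.

```python
import random


def queries_per_frame(num_nodes, t, num_frames, randomize, seed=0):
    """Count queries issued in frames 1..num_frames.

    Assumption: every node became visible in frame 0.
    """
    rng = random.Random(seed)
    counts = [0] * (num_frames + 1)  # index = frame number, frame 0 unused
    for _ in range(num_nodes):
        # first test after t frames, or at a random offset in [1..t]
        first = rng.randint(1, t) if randomize else t
        for frame in range(first, num_frames + 1, t):  # then every t frames
            counts[frame] += 1
    return counts[1:]


fixed = queries_per_frame(600, 6, 24, randomize=False)
spread = queries_per_frame(600, 6, 24, randomize=True)
```

Both variants issue the same total number of queries here, but without randomization all of them fall on every t-th frame, which matches the periodic frame rate drops mentioned on the previous slide.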
Randomization: Properties
+ Even distribution over frames
+ Nodes tested in regular intervals
+ Very stable for t between 5 and 10
Tried sophisticated models
But they could not beat the simple randomization!
Tight Bounding Volumes: Idea
Optimization for bounding volume hierarchy (BVH)
For each node → query bounding boxes of children (using single query)
Child boxes invisible → cull node, saved 2 queries
Box visible → traverse node
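Why the child boxes can be culled when the node's own box cannot is shown below in a 2D screen-space toy model. Everything here is an assumption for illustration: boxes are axis-aligned pixel rectangles, all boxes lie behind the occluders, and `visible_pixels` stands in for a hardware occlusion query that covers several boxes at once.

```python
def pixels(rect):
    """Set of integer pixels covered by rect = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = rect
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def visible_pixels(boxes, occluders):
    """Toy occlusion query: pixels of all boxes not hidden by any occluder."""
    covered, hidden = set(), set()
    for box in boxes:
        covered |= pixels(box)  # several boxes share a single query
    for occluder in occluders:
        hidden |= pixels(occluder)
    return len(covered - hidden)


occluders = [(0, 0, 4, 10), (6, 0, 10, 10)]  # two walls with a gap in between
children = [(1, 1, 3, 9), (7, 1, 9, 9)]      # tight boxes of the two children
parent = (1, 1, 9, 9)                        # box enclosing both children

parent_result = visible_pixels([parent], occluders)
children_result = visible_pixels(children, occluders)
```

The enclosing box shows through the gap and would be classified as visible, while the single query over both child boxes returns zero pixels, so the node can be culled.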
Tight Bounding Volumes for Leaves
Subdivide deeper than actual hierarchy depth → tight bounds also for leaves
Better adjusts to shape of objects
[Figure: tight bounds shown in red]
Tight Bounding Volumes: Properties
+ Bounds for interiors reduce queries
+ Bounds for leaves reduce geometry (without overhead of deeper hierarchy)
+ More boxes per query not relevant for hardware
Results
Powerplant (12M triangles)
Pompeii (6M triangles)
Results: Powerplant (12M triangles)
Results: Powerplant (12M triangles)
CHC + Batching + Randomization + Tight Bounds + Multiqueries
Results: Pompeii (6M triangles)
Conclusions
+ Hierarchical culling algorithm based on CHC
+ Kept simplicity of CHC
+ Improved several issues of CHC
+ Always better than view frustum culling
+ Up to 2–3 times speedup
+ Better than optimum as defined by NOHC
Drawback: several parameters
Multiqueries: Vienna (1M triangles)
Questions?
THANK YOU FOR YOUR ATTENTION!
Batching: Render engine integration
Rendering in modern engines:
Collect visible objects in render queue
Sort by materials
Render everything at once
Rendering single nodes is inefficient (CHC)
With batching: traverse render queue only before switch to query mode
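The integration described here can be sketched with a toy render queue that is flushed only right before a switch to query mode. `RenderQueue`, `frame` and the material and object names are hypothetical; no real engine API is implied.

```python
class RenderQueue:
    """Toy render queue: collects objects, sorts by material, draws on flush."""

    def __init__(self):
        self.pending = []            # (material, object) pairs waiting to be drawn
        self.draw_log = []           # order in which objects were drawn
        self.material_switches = 0

    def add(self, material, obj):
        self.pending.append((material, obj))

    def flush(self):
        """Draw everything collected so far, grouped by material."""
        last = None
        for material, obj in sorted(self.pending):
            if material != last:
                self.material_switches += 1
                last = material
            self.draw_log.append(obj)
        self.pending.clear()


def frame(visible_batches, queue):
    """Each batch of visible nodes ends with a switch to query mode."""
    for batch in visible_batches:
        for material, obj in batch:
            queue.add(material, obj)  # only collect while traversing
        queue.flush()                 # traverse the render queue before querying
        # a real renderer would now switch to query mode and issue the batch


queue = RenderQueue()
frame([[("brick", "h1"), ("glass", "w1"), ("brick", "h2")],
       [("glass", "w2")]], queue)
```

Objects collected between two query batches are drawn together and sorted by material, instead of one draw per traversed node.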
Future Work
(Semi-)automatically find optimal values for parameters
Calibrate parameters during representative walkthrough
Remove overhead also for difficult view points