Accelerating Marching Cubes with Graphics Hardware Gunnar Johansson, Linköping University Hamish...

31
Accelerating Marching Cubes with Graphics Hardware Gunnar Johansson, Linköping University Hamish Carr, University College Dublin
  • date post

    21-Dec-2015
  • Category

    Documents

  • view

    216
  • download

    1

Transcript of Accelerating Marching Cubes with Graphics Hardware Gunnar Johansson, Linköping University Hamish...

Accelerating Marching Cubes with Graphics

HardwareGunnar Johansson, Linköping UniversityHamish Carr, University College Dublin

Presentation outline

• Goal• Background• Previous work• Our approach• Results• Conclusions, Future work

Goal

• Isosurface visualization for studying 3D scalar functions– Marching Cubes is standard algorithm

• This work presents GPU acceleration in combination with CPU-based algorithmic acceleration (interval/Kd-trees)

Background

Isosurface visualization

• Goal: study a volumetric scalar function, f(x)

• Isosurface is a set of points with equal isovalue (h)

{ x : f(x) = h }

Illustration by

Stefan Roettger, University of Erlangen

Marching Cubes

• Each corner of a cube is classified asabove (black), or below(white), a given isovalue

• Vertices of surface is linearly interpolated along the edges

• Normals are computed usingcentral differences and interpolated along the edges

Marching Cubes

Example application

• Studying medicaldatasets– MRI, CT scans

Previous work

Previous workAlgorithmic acceleration

• Original marching cubes visits all cells in the dataset– O(N) in time complexity, N = number of cells

• However, an isosurface is expected to intersect only a fraction of the cells

• Efficient search structures can be used to store maximum and minimum value of each cell– Kd-tree O(√N + k), k = size of isosurface– Interval tree O(log N + k)

Previous workGPU acceleration

• Restricted to tetrahedral cells– Marching tetrahedra

• Pascucci, 04• Klein et al, 04• Reck et al, 04

• Cannot create/delete vertices on GPU– “Worst-case” strategy– Always fed 4 vertices (a quad) to the

GPU

Previous workGPU acceleration

• CPU tasks– Selects cell and sends data to GPU

• GPU tasks– Classifies cell– Interpolates surface vertices– Compute normals (per face)

• Bottleneck?– Data transfer CPU – GPU

Previous workGPU acceleration

• Parallel to our work– Goetz et al, “Real-time marching cubes

on the vertex shader”, 05

• Classifies cell on both CPU and GPU• Do not apply interval/Kd-trees• Only computes face normals

Previous work

• Traditional pipeline(with accelerating search structures)

Previous work

• GPU accelerated pipeline(by Pascucci / Reck et al / Klein et al)

Our approach

Our approach

• Marching cubes on GPU: Basic challenges– Cannot create vertices on GPU– Too costly to send all possible

triangulations (“worst-case” strategy)

Our approach

• “Caching cell topology”– Store each case triangulation on the

GPU using display lists– Classify cell on CPU and invoke

corresponding display list– Minimize CPU – GPU bandwidth

bottleneck by storing dataset on GPU

Our approach

• “Caching cell topology”

Our approach

• Display list stores corner indices• Use indices for texture lookup• Use values from texture

to interpolate verticesand normals

0

7 6

54

3 2

1(0,1)

(0,3)

(0,4)

Our approach

• Case classification still a CPU bottleneck?

Our approach

• Accelerate case classification by“pre-computing cell topology”

• Pre-compute possiblecases for each cell

• Store all intervalswith correspondingcase in intervalor Kd-tree

Our approach

• “Case interval/Kd-tree”– Shifts case classification to pre-

computation– Storage requirements increase for noisy

dataset (as much as 7 times)

Results

Results

• First approach– Store dataset packed in 2D 1-channel

float texture– Central differencing on GPU for normals– Results disappointing

• 1.2-1.6 speedup for Marching Cubes without accelerating structures

• Even decrease in speedup when using accelerating structures: GPU bottleneck

Results

• Vertex texture support is currently poor– Only 2D 1/4–channel floats– High latency

• Central differencing– 12 texture lookups per vertex normal

• 14 lookups in total for each vertex

Results

• Second approach– Pre-compute normals and store dataset and

normals packed in 2D 4-channel float texture

– Only need 2 lookups for each vertex– Results improved

• Speedup of 3-4 times compared with CPU counterpart

• 128x128x128 “Hydrogen atom” dataset– Interval tree + CPU: 27 fps– Interval tree + GPU: 112 fps (4 times speedup)

Conclusions

• Accelerating isosurface extractionusing GPU– Cache all possible cell triangulations

(cases)– Use CPU for classification– Use GPU for interpolation– Optimize CPU classification by pre-

computing all possible cases (case interval/Kd-tree)

Conclusions

• Applicable to any interpolant (in this work described using Marching Cubes)

• Current hardware impose restrictions– Float textures, high latency for vertex

texture lookup

Future work

• Move computation to fragment processor– More powerful than vertex processor– Better, more efficient texture support– Ability to download (to CPU) the

extracted surface

• Optimize memory usage (texture/system)

• Apply to higher-order interpolants

Thank you

Acknowledgements:Thanks to UCD for funding