GPU-Based Frequency Domain Volume Rendering
Transcript of GPU-Based Frequency Domain Volume Rendering
GPU-Based Frequency Domain Volume Rendering
  Ivan Viola, Armin Kanitsar, and Meister Eduard Gröller
  Institute of Computer Graphics and Algorithms
  Vienna University of Technology
2 / 16Ivan Viola Vienna University of Technology
Motivation
  volume rendering is time consuming
    computational complexity is O(N^3)
    our goal: fastest volume rendering
  GPUs
    very fast fragment processor
    very fast memory access
  Fourier Volume Rendering (FVR)
    theoretically fastest volume rendering
Frequency Domain Volume Rendering
  [Figure: overview diagram with parts labeled GPU and CPU]
FVR Characteristics
  Pros
    computational complexity O(N^2 log(N))
    renders the whole volume, not iso-surfaces
    very fast rendering stage:
      slicing in frequency domain
      inverse 2D Fourier transform
  Cons
    rendering produces X-ray images
    time-consuming preprocessing
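The two rendering steps listed above rest on the Fourier projection-slice theorem: a plane through the origin of the volume's 3D spectrum is the 2D spectrum of the X-ray projection along that plane's normal. The following is a minimal pure-Python sketch for the axis-aligned case only. It uses naive DFTs for readability, so it does not reach the O(N^2 log(N)) bound, and all function names are hypothetical.

```python
import cmath

def dft(values, inverse=False):
    """Naive 1D DFT (O(n^2)); the inverse includes the 1/n normalization."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [sum(values[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
               for m in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(image, inverse=False):
    """2D DFT of image[y][x]: transform the rows, then the columns."""
    rows = [dft(row, inverse) for row in image]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def dft3(volume):
    """3D DFT of volume[z][y][x]: 2D DFT per z-slab, then 1D DFT along z."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    slabs = [dft2(slab) for slab in volume]
    out = [[[0j] * nx for _ in range(ny)] for _ in range(nz)]
    for y in range(ny):
        for x in range(nx):
            column = dft([slabs[z][y][x] for z in range(nz)])
            for z in range(nz):
                out[z][y][x] = column[z]
    return out

def fvr_project_along_z(volume):
    """FVR for a view along z: preprocess, slice, inverse 2D transform."""
    spectrum = dft3(volume)                    # preprocessing, once per volume
    central_slice = spectrum[0]                # slicing: the plane k_z = 0
    image = dft2(central_slice, inverse=True)  # inverse 2D Fourier transform
    return [[v.real for v in row] for row in image]

def xray_project_along_z(volume):
    """Reference: spatial-domain X-ray projection (sum of samples along z)."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [[sum(volume[z][y][x] for z in range(nz)) for x in range(nx)]
            for y in range(ny)]
```

Both functions should return the same image up to rounding, which also illustrates why the result is an X-ray image: every sample along the viewing ray contributes with equal weight.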
Rendering Stage 1: Slicing
  stage with the highest speed-up
  nearest neighbor interpolation
    supported by GPU
  tri-linear interpolation
  tri-cubic interpolation
  windowed sinc of width four
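As a rough illustration of the slicing stage with the simplest filter, the sketch below resamples a plane through the origin of a 3D spectrum using nearest-neighbor lookups. It assumes a cubic N^3 spectrum with the zero frequency stored at index [0][0][0] and orthonormal plane axes; the function name and argument layout are hypothetical. Modulo addressing plays the role of a repeating texture.

```python
def slice_nearest(spectrum, u_axis, v_axis):
    """Sample the plane through the origin spanned by u_axis and v_axis.

    spectrum[z][y][x] is an N^3 array with the zero frequency at [0][0][0];
    u_axis and v_axis are (x, y, z) direction vectors.
    """
    n = len(spectrum)
    half = n // 2
    image = [[0j] * n for _ in range(n)]
    for row in range(n):
        for col in range(n):
            # Signed frequency coordinates in [-n/2, n/2).
            s = col if col < half else col - n
            t = row if row < half else row - n
            # Nearest neighbor: round the plane position to a grid point.
            x = round(s * u_axis[0] + t * v_axis[0])
            y = round(s * u_axis[1] + t * v_axis[1])
            z = round(s * u_axis[2] + t * v_axis[2])
            # Modulo addressing wraps negative frequencies around.
            image[row][col] = spectrum[z % n][y % n][x % n]
    return image
```

For the axis-aligned axes (1, 0, 0) and (0, 1, 0) this returns the plane spectrum[0] unchanged; rotated axes hit off-grid positions, which is where the higher-order filters listed above matter.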
Tri-Linear Interpolation
  not natively supported by graphics hardware
  can be computed using the LRP instruction
  [Figure: texture with corner coordinates [0,0] and [1,1], a sample position [X,Y], and the weight frac(8X)]
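One way to read the slide is that the eight neighboring texels are fetched with nearest-neighbor lookups and blended with seven linear interpolations, with the fractional part of the scaled texture coordinate as the weight (frac(8X) for a texture eight texels wide). The sketch below follows that reading. The helper lerp is only a stand-in for the LRP instruction, samples are assumed to sit at integer texel positions, and all names are hypothetical.

```python
import math

def lerp(a, b, t):
    """Linear blend a + t * (b - a); stands in for the GPU's LRP instruction."""
    return a + t * (b - a)

def trilinear(volume, x, y, z):
    """Tri-linear sample of volume[z][y][x] at normalized coordinates in [0, 1)."""
    n = len(volume)
    # Scale to texel space; the fractional part is the interpolation weight.
    sx, sy, sz = x * n, y * n, z * n
    x0, y0, z0 = math.floor(sx), math.floor(sy), math.floor(sz)
    wx, wy, wz = sx - x0, sy - y0, sz - z0

    def fetch(ix, iy, iz):
        # Nearest-neighbor fetch with wrap-around addressing.
        return volume[iz % n][iy % n][ix % n]

    # Four blends along x, two along y, one along z.
    c00 = lerp(fetch(x0, y0, z0), fetch(x0 + 1, y0, z0), wx)
    c10 = lerp(fetch(x0, y0 + 1, z0), fetch(x0 + 1, y0 + 1, z0), wx)
    c01 = lerp(fetch(x0, y0, z0 + 1), fetch(x0 + 1, y0, z0 + 1), wx)
    c11 = lerp(fetch(x0, y0 + 1, z0 + 1), fetch(x0 + 1, y0 + 1, z0 + 1), wx)
    c0 = lerp(c00, c10, wy)
    c1 = lerp(c01, c11, wy)
    return lerp(c0, c1, wz)
```

The same code works on complex spectrum values, since it only uses addition and multiplication by real weights.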
Cubic Interpolation & Windowed sinc
  not natively supported by graphics hardware
  no equivalent to LRP instruction
  filter kernel stored in textures [Hadwiger et al. VMV'01]
    separability of 3D kernel
    filters of width four stored in RGBA 1D texture
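The texture-based filtering described above can be sketched as a 1D lookup table whose entries hold the four weights of a width-four kernel, one per RGBA channel, indexed by the fractional sample offset. Because the 3D kernel is separable, each 3D weight is the product of three 1D weights. The slides do not say which cubic is used, so the Catmull-Rom kernel below is an assumption, as are the table size of 64 and all function names.

```python
import math

def cubic_weights(t):
    """Catmull-Rom weights for the samples at offsets -1, 0, 1, 2 (assumed kernel)."""
    t2, t3 = t * t, t * t * t
    return ((-t3 + 2 * t2 - t) / 2,
            (3 * t3 - 5 * t2 + 2) / 2,
            (-3 * t3 + 4 * t2 + t) / 2,
            (t3 - t2) / 2)

def build_kernel_texture(size=64):
    """1D table: entry i holds four weights (R, G, B, A) for offset i / size."""
    return [cubic_weights(i / size) for i in range(size)]

def filter_3d(volume, x, y, z, texture):
    """Width-four separable filter of volume[z][y][x] at texel-space (x, y, z)."""
    n = len(volume)

    def lookup(position):
        base = math.floor(position)
        index = min(int((position - base) * len(texture)), len(texture) - 1)
        return base, texture[index]

    (bx, wx), (by, wy), (bz, wz) = lookup(x), lookup(y), lookup(z)
    total = 0.0
    for k in range(4):
        for j in range(4):
            for i in range(4):
                sample = volume[(bz - 1 + k) % n][(by - 1 + j) % n][(bx - 1 + i) % n]
                # Separability: the 3D weight is a product of 1D weights.
                total += wz[k] * wy[j] * wx[i] * sample
    return total
```

A windowed sinc of width four would fit the same table layout; only cubic_weights would change.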
Rendering Stage 2: Inverse 2D FFT
  1D FFT consists of two parts
    scrambling
    butterfly operation
  [Figure: processing pipeline with the labels INPUT IMAGE; HORIZONTAL DIRECTION: SCRAMBLE, BUTTERFLY; VERTICAL DIRECTION: SCRAMBLE, BUTTERFLY; NORMALIZATION; INVERSE TRANSFORM]
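The pipeline in the figure can be sketched as follows: unnormalized 1D inverse transforms along the horizontal direction, then along the vertical direction, followed by a single normalization step. For brevity the 1D transform here is a recursive radix-2 FFT rather than the scramble and butterfly passes of the slides, the sign convention of the exponent is an assumption, and the function names are hypothetical.

```python
import cmath

def inverse_fft_1d(values):
    """Unnormalized inverse FFT (recursive radix-2; length must be a power of two)."""
    n = len(values)
    if n == 1:
        return list(values)
    even = inverse_fft_1d(values[0::2])
    odd = inverse_fft_1d(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def inverse_fft_2d(spectrum):
    """Inverse 2D FFT of spectrum[v][u]: rows, then columns, then normalization."""
    height, width = len(spectrum), len(spectrum[0])
    rows = [inverse_fft_1d(row) for row in spectrum]          # horizontal direction
    cols = [inverse_fft_1d(list(col)) for col in zip(*rows)]  # vertical direction
    scale = 1.0 / (width * height)                            # normalization
    return [[cols[x][y] * scale for x in range(width)] for y in range(height)]
```

Deferring the scale factor to one final step keeps the 1D passes identical in both directions.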
Fast Fourier Transform in 1D
  [Figure: signal-flow diagram of an 8-point FFT. The scramble step reorders the inputs a0 ... a7 into a0, a4, a2, a6, a1, a5, a3, a7. The butterfly stages apply the factors 1 and -1 and the twiddle factors W_N^k (W_8^0, W_8^2, W_8^4, W_8^6, then W_8^0 ... W_8^7) and produce the outputs A0 ... A7.]
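The two parts of the diagram can be sketched in a few lines: a scramble step that moves each input to its bit-reversed index, followed by log2(N) butterfly stages. The forward sign convention W_N = exp(-2*pi*i/N) is an assumption, and the function names are hypothetical.

```python
import cmath

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index, e.g. 001 -> 100."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def scramble(values):
    """Reorder values so that position i holds the bit-reversed input index."""
    bits = (len(values) - 1).bit_length()
    return [values[bit_reverse(i, bits)] for i in range(len(values))]

def fft(values):
    """Iterative radix-2 FFT; the length must be a power of two."""
    n = len(values)
    data = scramble(values)
    span = 1                      # distance between the two butterfly inputs
    while span < n:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))  # twiddle factor
                p = data[start + k]
                q = data[start + k + span] * w
                data[start + k] = p + q          # butterfly, sign +1
                data[start + k + span] = p - q   # butterfly, sign -1
        span *= 2
    return data
```

For eight inputs, scramble(list(range(8))) yields the order 0, 4, 2, 6, 1, 5, 3, 7 shown in the figure.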
Fast Fourier Transform on the GPU
  two buffers: ping-pong rendering
  two-channel rendering buffers required
  scramble pass
    1D lookup
  butterfly passes
    log2(N) passes
    texture encodes
      W_N^k
      p and q coordinate
      butterfly sign
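The pass structure above can be simulated on the CPU. In the sketch below each pass reads only from a source buffer and writes only to a target buffer, then the two are swapped (ping-pong). Every output element looks up its two source addresses p and q, the twiddle factor W_N^k, and the butterfly sign in a per-pass table, which stands in for the lookup texture. The table layout, the sign convention, and all names are assumptions, not the authors' actual encoding.

```python
import cmath

def build_scramble_table(n):
    """1D lookup for the scramble pass: bit-reversed source index per output (n >= 2)."""
    bits = (n - 1).bit_length()
    return [int(format(i, "0{}b".format(bits))[::-1], 2) for i in range(n)]

def build_butterfly_tables(n):
    """One table per pass; entry i is (p, q, W, sign) for output element i."""
    tables = []
    span = 1
    while span < n:
        table = []
        for i in range(n):
            k = i % span
            upper = (i // span) % 2 == 1   # second output of its butterfly?
            p = i - span if upper else i
            q = p + span
            w = cmath.exp(-2j * cmath.pi * k / (2 * span))
            table.append((p, q, w, -1 if upper else 1))
        tables.append(table)
        span *= 2
    return tables

def gpu_style_fft(values):
    """FFT as gather passes over two buffers; the length must be a power of two."""
    n = len(values)
    source = [complex(v) for v in values]   # two channels: real and imaginary
    target = [0j] * n
    lookup = build_scramble_table(n)
    for i in range(n):                      # scramble pass
        target[i] = source[lookup[i]]
    source, target = target, source         # ping-pong swap
    for table in build_butterfly_tables(n): # log2(n) butterfly passes
        for i, (p, q, w, sign) in enumerate(table):
            target[i] = source[p] + sign * w * source[q]
        source, target = target, source
    return source
```

Writing each output as an independent gather is what makes the pass map onto a fragment program: no element ever reads a value written in the same pass.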
Hartley Transform - Alternative to FFT
  real input is transformed into real output
    ½ memory requirements
  scrambling the same as in FFT
  double-butterfly operation
    three source values, cos and sin
  HT not separable
    additional correction pass required
  GPU implementation not faster than FFT
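Two of the properties above can be checked with a naive sketch of the discrete Hartley transform, which uses the real kernel cas(a) = cos(a) + sin(a). Applying 1D transforms along rows and then columns does not give the true 2D transform, because cas(a + b) is not cas(a) * cas(b); a correction step that combines four mirrored entries repairs this. The function names are hypothetical, and the code is O(n^2) per 1D transform for clarity.

```python
import math

def cas(angle):
    """Hartley kernel: cos + sin."""
    return math.cos(angle) + math.sin(angle)

def dht(values):
    """Naive 1D discrete Hartley transform: real input, real output."""
    n = len(values)
    return [sum(values[m] * cas(2 * math.pi * m * k / n) for m in range(n))
            for k in range(n)]

def separable_dht_2d(image):
    """Row-column application of the 1D DHT (not yet the true 2D DHT)."""
    rows = [dht(row) for row in image]
    cols = [dht(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def corrected_dht_2d(image):
    """True 2D DHT: separable passes plus the correction pass."""
    t = separable_dht_2d(image)
    height, width = len(t), len(t[0])
    out = [[0.0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            nu, nv = (-u) % width, (-v) % height   # mirrored frequencies
            out[v][u] = 0.5 * (t[v][u] + t[v][nu] + t[nv][u] - t[nv][nu])
    return out

def direct_dht_2d(image):
    """Reference: 2D DHT straight from the definition."""
    height, width = len(image), len(image[0])
    return [[sum(image[y][x] * cas(2 * math.pi * (u * x / width + v * y / height))
                 for y in range(height) for x in range(width))
             for u in range(width)] for v in range(height)]
```

The 1D transform is also its own inverse up to a factor of 1/n, so the same passes serve for the forward and inverse direction.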
Fast Hartley Transform on the GPU
  similar to FFT: ping-pong rendering
  only one-channel rendering buffers required
  scrambling the same
  double-butterfly
    two lookup textures
      addresses of source values (3 channels)
      cos and sin terms (2 channels)
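The double butterfly can be sketched with a recursive decimation-in-time fast Hartley transform. Each output combines three source values (one from the even half, two from the odd half) with one cos and one sin term, which matches the three address channels and two trigonometric channels listed above. The recursive form replaces the scramble pass for brevity, and the function names are hypothetical.

```python
import math

def fht(values):
    """Recursive fast Hartley transform; the length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fht(values[0::2])
    odd = fht(values[1::2])
    half = n // 2
    out = [0.0] * n
    for k in range(half):
        c = math.cos(2 * math.pi * k / n)
        s = math.sin(2 * math.pi * k / n)
        # Double butterfly: even[k], odd[k] and the mirrored odd[-k mod half].
        rotated = c * odd[k] + s * odd[(half - k) % half]
        out[k] = even[k] + rotated
        out[k + half] = even[k] - rotated
    return out

def naive_dht(values):
    """Reference DHT straight from the definition, O(n^2)."""
    n = len(values)
    return [sum(values[m] * (math.cos(2 * math.pi * m * k / n) +
                             math.sin(2 * math.pi * m * k / n))
                for m in range(n)) for k in range(n)]
```

All intermediate values stay real, which is why a single-channel buffer is enough.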
Results
  Frame rates for ATI Radeon 9800 XT

  Resolution   NN     TL     TC    IFFT 2D
  256x256      1450   1050   180   153
  512x512      500    350    45    35

  (NN: nearest neighbor, TL: tri-linear, TC: tri-cubic slicing; IFFT 2D: inverse 2D FFT)
Demo
Conclusions
  rendering stage of FVR very fast on GPU
    slicing: high performance gain
    wrap around is “for free”
    speed-up also for inverse FFT
  nearest neighbour: very poor quality
  tri-linear interpolation: high performance
  tri-cubic interpolation: high quality
Thank You!
  [email protected] (cg.tuwien.ac.at)