Data Compression for Hardware-accelerated Volume Rendering


Description

Jens Schneider, Rüdiger Westermann, Technical University Munich. Motivation: data of increasing size (large-scale, multi-dimensional, multi-parameter) has to be handled, which raises growing problems in compression, representation, and rendering.

Transcript of Data Compression for Hardware-accelerated Volume Rendering

Page 1: Data Compression for  Hardware-accelerated Volume Rendering

computer graphics & visualization

Data Compression for Hardware-accelerated Volume Rendering

Jens Schneider

Rüdiger Westermann

Technical University Munich

Page 2: Data Compression for  Hardware-accelerated Volume Rendering

Motivation

Need to deal with data of increasing size:
• Large-scale
• Multi-dimensional
• Multi-parameter

Increasing problems:
• Compression
• Representation
• Rendering

We will address all three problems!

Page 3: Data Compression for  Hardware-accelerated Volume Rendering

Talk Outline

The Approach – Vector Quantization

Contributions

Quality and speed
• Hierarchical encoding
• PCA-Split
• Progressive encoding of time-resolved data

Multi-dimensional data
• Vectors of arbitrary length

Rendering from compressed data
• GPU-based decoding and rendering
• Per-fragment evaluation
• Interactive frame rates

Page 4: Data Compression for  Hardware-accelerated Volume Rendering

Talk Outline

The Application – Volume Rendering
• Large-scale volumetric data sets
• Time-varying sequences

Skull CT scan: 16 MB / 14 fps (original), 0.78 MB / 11 fps (compressed)

Shockwave sequence: 1.4 GB / 20 fps (original), 70 MB / 24 fps (compressed)

Page 5: Data Compression for  Hardware-accelerated Volume Rendering

Talk Outline

The Future – Video Compression?
• Video compression techniques very exciting!
• Merge video decoding pipeline and 3D API

Promising Technologies
• MPEG-II Streams
• XvMC API
• OpenGL Superbuffers
• Commodity graphics hardware video functionality

Chip vendors just beginning to realize this!

Page 6: Data Compression for  Hardware-accelerated Volume Rendering

Vector Quantization

Codebook $C$ with codewords

Encoder (input mapping): $X_n \mapsto i_n = E(X_n)$

Decoder (output mapping): $i_n \mapsto X'_n = C(i_n)$
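The encoder/decoder pair above can be sketched in a few lines. This is a minimal illustration, assuming a nearest-neighbor encoder under squared Euclidean distance (the slide only names the mappings $E$ and $C$); the function names and the toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(x, codebook):
    """Input mapping E: index of the codeword closest to x (assumed metric)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(x, codebook[i]))


def decode(i, codebook):
    """Output mapping C: look the codeword up again."""
    return codebook[i]


# Hypothetical toy data: three 2D codewords, three input vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
inputs = [(0.2, -0.1), (0.9, 1.3), (3.5, 0.4)]

indices = [encode(x, codebook) for x in inputs]         # what gets stored
reconstructed = [decode(i, codebook) for i in indices]  # lossy approximation
```

Only the indices and the codebook need to be stored, which is where the compression comes from.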

Page 7: Data Compression for  Hardware-accelerated Volume Rendering

Vector Quantization

LBG-Algorithm
• Linde, Buzo and Gray 1980
• Iterative refinement of a previous codebook
• Sensitive to quality of first codebook
• Usually computationally expensive

Speed-up possible (and necessary)
• Partial searches
• Fast searches
• Better initial codebook (e.g. PCA-Splits)

LBG-Algorithm can be fast!
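The iterative refinement can be sketched as follows. This is a plain, unoptimized variant with a full nearest-neighbor search (none of the partial or fast searches mentioned above); the stopping rule, the handling of empty cells, and all names are assumptions made for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v, codebook):
    return min(range(len(codebook)),
               key=lambda i: squared_distance(v, codebook[i]))


def lbg_step(vectors, codebook):
    """One refinement step: assign vectors to cells, move codewords to centroids."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    error = 0.0
    for v in vectors:
        i = nearest(v, codebook)
        error += squared_distance(v, codebook[i])
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += v[d]
    # Assumption: an empty cell simply keeps its previous codeword.
    refined = [tuple(s / counts[i] for s in sums[i]) if counts[i] else codebook[i]
               for i in range(len(codebook))]
    return refined, error


def lbg(vectors, codebook, max_steps=20, tolerance=1e-9):
    """Repeat refinement until the distortion stops decreasing."""
    previous = float("inf")
    for _ in range(max_steps):
        codebook, error = lbg_step(vectors, codebook)
        if previous - error <= tolerance:
            break
        previous = error
    return codebook


# Hypothetical toy data with a deliberately poor first codebook.
vectors = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
refined = lbg(vectors, [(1.0, 1.0), (8.0, 8.0)])
```

Because every step starts from the previous codebook, a good first codebook means fewer steps, which is the motivation for the PCA-Split initialization.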

Page 8: Data Compression for  Hardware-accelerated Volume Rendering

Vector Quantization

The PCA-Split
• Lensch et al. 2001 – BRDF Compression
• Covariance analysis to find optimal splitting plane
• Cut a cluster of input vectors in two by this plane.
• Plane is given by the centroid of the current set and the eigenvector belonging to the largest eigenvalue of the auto-covariance matrix (= plane normal)
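A single PCA-Split might look like the sketch below. The dominant eigenvector is approximated with power iteration to stay within the standard library; that choice, the fixed start vector, and the function names are assumptions, not something the slides specify.

```python
def centroid(vectors):
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def covariance(vectors, mean):
    """Auto-covariance matrix of the cluster as a list of rows."""
    dim, n = len(mean), len(vectors)
    return [[sum((v[r] - mean[r]) * (v[c] - mean[c]) for v in vectors) / n
             for c in range(dim)] for r in range(dim)]


def principal_axis(cov, iterations=100):
    """Power iteration: approximates the eigenvector of the largest eigenvalue."""
    dim = len(cov)
    v = [1.0] * dim  # assumed start vector; must not be orthogonal to the axis
    for _ in range(iterations):
        w = [sum(cov[r][c] * v[c] for c in range(dim)) for r in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def pca_split(vectors):
    """Cut a cluster in two by the plane through its centroid."""
    mean = centroid(vectors)
    normal = principal_axis(covariance(vectors, mean))
    left, right = [], []
    for v in vectors:
        side = sum((v[d] - mean[d]) * normal[d] for d in range(len(mean)))
        (left if side < 0 else right).append(v)
    return left, right


# Hypothetical toy cluster that is elongated along the x axis.
cluster = [(0.0, 0.0), (1.0, 0.1), (10.0, 0.0), (11.0, -0.1)]
left, right = pca_split(cluster)
```

The centroids of the resulting halves can serve as codewords; repeating the split on further clusters grows the codebook to the desired size.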

Page 9: Data Compression for  Hardware-accelerated Volume Rendering

Vector Quantization

LBG as PCA post-processing
• Increases fidelity
• Leads to stable Voronoi regions
• Only a few steps are necessary
• Great speed-up compared to LBG only!

Figure: a series of LBG steps, codebook from last slide

Page 10: Data Compression for  Hardware-accelerated Volume Rendering

Example

Full-color confocal microscopy scan, $512^2 \times 32$, RGB

Panels: Original, 32MB; 4D vectors, 2MB; 32D vectors, 1MB

Page 11: Data Compression for  Hardware-accelerated Volume Rendering

Hierarchical Vector Quantization

Laplace Decomposition

Page 12: Data Compression for  Hardware-accelerated Volume Rendering

Hierarchical Vector Quantization

$4^3$-dim. VQ

$2^3$-dim. VQ

Direct Copy
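One way to read the three levels is as a small Laplacian pyramid over each $4^3$ block: a 64-component detail vector, an 8-component detail vector, and a single average that is copied directly. The sketch below assumes plain box averaging and piecewise-constant upsampling; the slides name only the decomposition, so the filter, the component ordering, and the function names are illustrative.

```python
from itertools import product

FINE = list(product(range(4), repeat=3))    # 64 voxel positions in a block
COARSE = list(product(range(2), repeat=3))  # 8 positions at half resolution


def decompose_block(block):
    """Split a 4^3 block (dict keyed by (x, y, z)) into mean + two detail vectors."""
    coarse = {}
    for cx, cy, cz in COARSE:
        cell = [block[(2 * cx + dx, 2 * cy + dy, 2 * cz + dz)]
                for dx, dy, dz in COARSE]
        coarse[(cx, cy, cz)] = sum(cell) / 8.0
    mean = sum(coarse.values()) / 8.0                  # copied directly
    detail_8 = [coarse[c] - mean for c in COARSE]      # input to the 2^3-dim. VQ
    detail_64 = [block[(x, y, z)] - coarse[(x // 2, y // 2, z // 2)]
                 for x, y, z in FINE]                  # input to the 4^3-dim. VQ
    return mean, detail_8, detail_64


def reconstruct_block(mean, detail_8, detail_64):
    """Inverse of decompose_block (exact as long as nothing is quantized)."""
    coarse = {c: mean + d for c, d in zip(COARSE, detail_8)}
    return {(x, y, z): coarse[(x // 2, y // 2, z // 2)] + d
            for (x, y, z), d in zip(FINE, detail_64)}


# Hypothetical test block: a simple ramp.
block = {(x, y, z): float(x + 4 * y + 16 * z) for x, y, z in FINE}
mean, detail_8, detail_64 = decompose_block(block)
restored = reconstruct_block(mean, detail_8, detail_64)
```

In the compressed representation the two detail vectors would each be replaced by a codebook index, so the reconstruction is only approximate.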

Page 13: Data Compression for  Hardware-accelerated Volume Rendering

Hierarchical Vector Quantization

Output:
• One RGB index volume
• Two codebooks

RGB index volume → 3D texture

Codebooks → 2D textures

Page 14: Data Compression for  Hardware-accelerated Volume Rendering

Example

Visible Human (Male), RGB slice 2048x1216

Compression took 10.0 seconds, PSNR = 34.72dB

Original (7.1MB) Compressed (285KB)
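For reference, a PSNR figure like the one quoted here is commonly computed from the mean squared error between original and decoded samples. The sketch below assumes 8-bit data with a peak value of 255; the slides do not state which peak value or error norm was used.

```python
import math


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0.0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical two-sample example: one sample is off by 5 gray levels.
value = psnr([0, 255], [0, 250])
```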

Page 15: Data Compression for  Hardware-accelerated Volume Rendering

Timings

Reference System: P4 2.8GHz, 1GB memory

Data set                            Time
VHP Slice, 2048x1216 RGB            10.0 sec
Engine $256^2 \times 128$ CT-Scan   19.0 sec
Skull $256^3$ CT-Scan               50.6 sec
Vortex Sequence, $128^3 \times 100$      13 (5) min
Shockwave Sequence, $256^3 \times 89$    29 (13) min

Page 16: Data Compression for  Hardware-accelerated Volume Rendering

Rendering

GPU-based decoding
• Indices stored in 3D RGB texture (3/64th original size)
• Decode index per block → dependent fetch
• Decode address per block → $4^3$ address texture

Figure: decoding process in flatland
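The 3/64 figure follows from storing three bytes (one RGB texel) per $4^3$ block of scalar voxels. The estimate below adds the two codebooks on top; the 8-bit channels and the 256 codewords per codebook are assumptions, chosen because one byte per channel can address at most 256 entries.

```python
def compressed_size_bytes(nx, ny, nz, codewords=256):
    """Rough size of index volume plus codebooks for an 8-bit scalar volume."""
    blocks = (nx // 4) * (ny // 4) * (nz // 4)
    index_volume = 3 * blocks          # one RGB texel per 4^3 block
    codebooks = codewords * (64 + 8)   # 64-dim. and 8-dim. codewords, 1 byte each
    return index_volume + codebooks


engine = compressed_size_bytes(256, 256, 128)  # 411648 bytes = 402 KB
skull = compressed_size_bytes(256, 256, 256)   # 804864 bytes = 786 KB
```

Under these assumptions the estimate lands at 402 KB for the engine and about 786 KB for the skull, close to the compressed sizes reported on the result slides, and the ratio of roughly 64:3 matches the quoted compression of about 20:1.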

Page 17: Data Compression for  Hardware-accelerated Volume Rendering

Rendering

Render 3D index and address texture
• Nearest neighbor interpolation for both
• GL_REPEAT for address texture

Per-fragment decoding
• Decode detail components and dependent fetch
• Add the details to average component (Red channel)
• Look up result in 1D RGB transfer function

Problem: Complex fragment shader slows down rendering
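The per-fragment steps can be mimicked on the CPU to make the data flow explicit. In this sketch the integer division plays the role of the nearest-neighbor fetch from the index texture, and the modulo plays the role of the repeating $4^3$ address texture. The channel assignment of the two detail indices, the component ordering inside a codeword, and all names are assumptions.

```python
def decode_voxel(x, y, z, index_volume, codebook_8, codebook_64, transfer_function):
    """CPU analogue of the per-fragment decode for one voxel position."""
    # Nearest-neighbor fetch of the block's RGB index texel.
    mean, i8, i64 = index_volume[(x // 4, y // 4, z // 4)]
    # Address inside the block; repeats every 4 voxels like a GL_REPEAT texture.
    ax, ay, az = x % 4, y % 4, z % 4
    # Dependent fetches into the two codebooks (assumed x-fastest ordering).
    detail_64 = codebook_64[i64][ax + 4 * ay + 16 * az]
    detail_8 = codebook_8[i8][(ax // 2) + 2 * (ay // 2) + 4 * (az // 2)]
    # Add the details to the average, clamp, and apply the transfer function.
    value = max(0, min(255, round(mean + detail_8 + detail_64)))
    return transfer_function[value]


# Hypothetical single-block volume with tiny codebooks.
index_volume = {(0, 0, 0): (100, 0, 1)}
codebook_8 = [[-4, 4, 0, 0, 0, 0, 0, 0]]
codebook_64 = [[0] * 64, [1] * 64]
gray_ramp = [(v, v, v) for v in range(256)]

a = decode_voxel(0, 0, 0, index_volume, codebook_8, codebook_64, gray_ramp)
b = decode_voxel(2, 0, 0, index_volume, codebook_8, codebook_64, gray_ramp)
```

Every voxel access costs three dependent lookups plus the transfer function fetch, which is why the shader is expensive.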

Page 18: Data Compression for  Hardware-accelerated Volume Rendering

Rendering

Solution: Deferred Fragment Processing

Avoid decoding in empty regions. "Empty" means:

a) Transfer function maps 0 → 0.
• Check on CPU
• Switch between two possible rendering modes

b) Average value is 0 (Red channel)
• Check in a first, simple fragment program
• Fragment's depth value is set accordingly
• Second pass: discard (early Z-Test) or render fragment
• Full decoding only performed in second pass
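The two-pass idea can be illustrated with a CPU analogy. Here a boolean mask stands in for the depth values written by the first pass, and skipping a fragment stands in for the early Z-test; the function names and the toy texels are hypothetical.

```python
def first_pass(index_texels):
    """Cheap pass: only the average (red channel) of each texel is inspected."""
    return [mean != 0 for (mean, _i8, _i64) in index_texels]


def second_pass(index_texels, needs_decoding, full_decode):
    """Expensive pass: full_decode runs only where the first pass allows it."""
    return [full_decode(texel) if keep else None
            for texel, keep in zip(index_texels, needs_decoding)]


calls = []


def full_decode(texel):
    """Stand-in for the complex decoding shader; records how often it runs."""
    calls.append(texel)
    mean, _i8, _i64 = texel
    return mean


texels = [(0, 3, 7), (120, 1, 2), (0, 0, 0), (45, 9, 9)]
mask = first_pass(texels)
result = second_pass(texels, mask, full_decode)
```

In this toy run the expensive decode executes for two of the four fragments; the saving on real data depends on how much of the volume is empty.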

Page 19: Data Compression for  Hardware-accelerated Volume Rendering

$256^2 \times 128$ Engine CT Scan

19.0 seconds, PSNR = 36.17dB (P4 2.8GHz)

Original (8MB) – 19 fps Compressed (402KB) – 12 fps

Page 20: Data Compression for  Hardware-accelerated Volume Rendering

$256^3$ Skull CT Scan

50.6 seconds, PSNR = 35.35dB (P4 2.8GHz)

Original (16MB) – 14 fps Compressed (780KB) – 11 fps

Page 21: Data Compression for  Hardware-accelerated Volume Rendering

Time-resolved Sequences

Exploit temporal coherences during compression:
• Group of Frames (GOF)

First frame in a GOF:
• PCA-Split followed by LBG-Refinement

Other frames:
• LBG-refinement of last Index-Volume and Codebook

Result:
• Great speed-up (factor 2 to 3)
• Very large GOFs possible (64+ frames)
• Virtually same fidelity as frame-by-frame
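The progressive scheme can be sketched as a loop over frames in which only the first frame of each GOF is initialized from scratch. The callable `seed_codebook` is a hypothetical stand-in for the PCA-Split initialization, and the step counts are illustrative values, not figures from the talk.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v, codebook):
    return min(range(len(codebook)),
               key=lambda i: squared_distance(v, codebook[i]))


def lbg_refine(vectors, codebook, steps):
    """A fixed number of LBG steps; empty cells keep their old codeword."""
    for _ in range(steps):
        cells = [[] for _ in codebook]
        for v in vectors:
            cells[nearest(v, codebook)].append(v)
        codebook = [tuple(sum(c) / len(cell) for c in zip(*cell)) if cell else old
                    for cell, old in zip(cells, codebook)]
    return codebook


def encode_sequence(frames, seed_codebook, gof_size, first_steps=10, next_steps=2):
    """Encode each frame, reusing the previous frame's codebook inside a GOF."""
    encoded, codebook = [], None
    for t, vectors in enumerate(frames):
        if t % gof_size == 0:
            # First frame of a GOF: fresh initialization plus LBG refinement.
            codebook = lbg_refine(vectors, seed_codebook(vectors), first_steps)
        else:
            # Other frames: a few LBG steps starting from the last codebook.
            codebook = lbg_refine(vectors, codebook, next_steps)
        encoded.append((codebook, [nearest(v, codebook) for v in vectors]))
    return encoded


# Hypothetical 1D sequence that drifts slowly from frame to frame.
frames = [[(0.0,), (1.0,), (10.0,), (11.0,)],
          [(0.5,), (1.5,), (10.5,), (11.5,)]]
encoded = encode_sequence(frames, lambda vs: [vs[0], vs[-1]], gof_size=64)
```

Each frame still ends up with its own codebook and index list, so any frame can be decoded on its own.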

Page 22: Data Compression for  Hardware-accelerated Volume Rendering

$128^3 \times 100$ Vortex-Simulation

5 minutes, PSNR = 34.43dB (P4 2.8 GHz)

Original (200MB) - 28 fps Compressed (11MB) - 16 fps

Page 23: Data Compression for  Hardware-accelerated Volume Rendering

$256^3 \times 89$ Shockwave-Sequence

13 minutes, PSNR = 51.36dB (P4 2.8 GHz)

Original (1.4GB) - 20 fps Compressed (70MB) - 24 fps

Page 24: Data Compression for  Hardware-accelerated Volume Rendering

Conclusions

• Compression ratios of approx. 20:1
• Interactive rendering possible
• Easy random access to each frame
• Wide variety of data sets handled
• Currently only nearest neighbor interpolation
• Mainly limited by performance / instruction count.
• Tri-linear interpolation can be done on newer GPUs!

Page 25: Data Compression for  Hardware-accelerated Volume Rendering

Online Demo

Shockwave sequence

Vortex sequence

Page 26: Data Compression for  Hardware-accelerated Volume Rendering

The Future?

Typical MPEG Decoding Pipeline

Diagram labels: CPU, Video Chip

MPEG Stream
• De-Quantisation
• Motion Compensation
• Inverse DCT
• Colorspace Conversion

Predictor / Corrector method

Further compression opportunities

Page 27: Data Compression for  Hardware-accelerated Volume Rendering

The Future?

Merge with OpenGL API

MPEG Stream
• De-Quantisation
• Motion Compensation
• Inverse DCT
• Colorspace Conversion
• P- / Super-Buffer Blit
• Bind as Texture
• Fragment Processing

XvMC

Page 28: Data Compression for  Hardware-accelerated Volume Rendering

XvMC

Extension to X-Server

Already supported on:
• GeForce 4 MX / GeForce FX (full)
• Other GeForces (no iDCT)

Driver code
• Not open source
• Other vendors working on implementation

Specification: Mark Vojkovich, XFree86 Project

Good performance!

Page 29: Data Compression for  Hardware-accelerated Volume Rendering

Other Possibilities

Super-Buffer / "Über-Buffer"
• OpenGL extension
• Basically allows malloc() on video RAM
• Beta implementation available

Might be used to merge video and OpenGL pipes!
• More OS independence
• More hardware independence
• Easier to implement
• Only on newer GPUs

Some research still necessary!

Page 30: Data Compression for  Hardware-accelerated Volume Rendering

Thank You!

Questions?