Afrigraph Tutorial B: Interactive Ray-Tracing

Ingo Wald, Philipp Slusallek
Saarland University, Computer Graphics Group
http://graphics.cs.uni-sb.de


Afrigraph 2001, Capetown, ZA Tutorial on Interactive Raytracing

For almost 20 years, researchers have argued that eventually, ray-tracing will become faster than rasterization.

And nothing happened... Well, almost...

UNC Powerplant (12.5 Mtris, >10 fps)

Four Power Plants (50 Mtris)

Tutorial Overview

· Introduction
  – Introduction to Ray-Tracing
  – Discussion: Ray-Tracing versus Rasterization
· Previous Work
  – Approximating Ray-Tracing
  – Accelerated Ray-Tracing
· Interactive Ray-Tracing on PCs
  – Coherent Ray-Tracing Implementation
  – Comparisons (SW / HW)
  – Distributed RT of Massive Models
· Outlook: Hardware-Architectures for Ray-Tracing
· Future Research and Conclusions


Introduction to Ray-Tracing

· In principle: a very simple algorithm
  – For each pixel
    • Create a ray through that pixel
    • Cast the ray into the scene and find the closest intersection
    • "Shade" the ray at the intersection point
  – Can also shoot new rays during shading:
    • Determine visibility of point lights by "shadow rays"
    • Compute reflected/refracted light by recursively tracing reflection/refraction rays
  – Basically, that's all…
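The per-pixel loop on this slide can be sketched in a few dozen lines of C++. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Vec3`, `Ray`, `Sphere`, `intersect`, `trace` and `render` are hypothetical, the scene contains only spheres, the camera is a fixed pinhole at the origin looking down -z, the closest hit is found by exhaustive search, and shading is a single diffuse term plus one shadow ray.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 normalize(Vec3 v) { return v * (1.0f / std::sqrt(dot(v, v))); }

struct Ray { Vec3 origin, dir; };            // dir is assumed to be unit length
struct Sphere { Vec3 center; float radius; };

// Distance along the ray to the nearest sphere intersection, or a negative
// value if the ray misses (the sketch ignores origins inside the sphere).
static float intersect(const Ray& r, const Sphere& s) {
    Vec3 oc = r.origin - s.center;
    float b = dot(oc, r.dir);
    float c = dot(oc, oc) - s.radius * s.radius;
    float disc = b * b - c;
    if (disc < 0.0f) return -1.0f;
    return -b - std::sqrt(disc);
}

// Find the closest intersection, then shade it. Returns a gray value; 0 = background.
static float trace(const Ray& ray, const std::vector<Sphere>& scene, Vec3 lightPos) {
    float tMin = std::numeric_limits<float>::infinity();
    const Sphere* hit = nullptr;
    for (const Sphere& s : scene) {          // exhaustive search, no index structure
        float t = intersect(ray, s);
        if (t > 1e-4f && t < tMin) { tMin = t; hit = &s; }
    }
    if (!hit) return 0.0f;

    Vec3 p = ray.origin + ray.dir * tMin;    // intersection point
    Vec3 n = normalize(p - hit->center);     // surface normal
    Vec3 toLight = lightPos - p;
    float lightDist = std::sqrt(dot(toLight, toLight));
    Vec3 l = toLight * (1.0f / lightDist);

    // Shadow ray: the point light is visible only if nothing lies in between.
    Ray shadow{p + n * 1e-3f, l};
    for (const Sphere& s : scene) {
        float t = intersect(shadow, s);
        if (t > 0.0f && t < lightDist) return 0.1f;   // ambient term only
    }
    float diffuse = dot(n, l) > 0.0f ? dot(n, l) : 0.0f;
    return 0.1f + 0.9f * diffuse;
}

// For each pixel: create a ray, cast it, shade it.
std::vector<float> render(int w, int h, const std::vector<Sphere>& scene, Vec3 lightPos) {
    std::vector<float> image(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float u = 2.0f * (x + 0.5f) / w - 1.0f;   // image plane spans [-1, 1]^2 at z = -1
            float v = 1.0f - 2.0f * (y + 0.5f) / h;
            Ray ray{{0.0f, 0.0f, 0.0f}, normalize({u, v, -1.0f})};
            image[static_cast<std::size_t>(y) * w + x] = trace(ray, scene, lightPos);
        }
    }
    return image;
}
```

Reflection and refraction would be added by calling `trace` recursively from the shading step, which is where the recursion mentioned on the slide comes in.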

[Figure: Ray-Tracing Algorithm]

Introduction to Ray-Tracing

· Only three main components:
  – Generating rays
  – Finding the closest intersection of a ray
    • Ray traversal
    • Ray-object intersection
  – Shading

Ray-Generation

· Generate initial ray for each pixel
  – Other camera models are trivial…
    • Fisheye lens
    • Non-linear distortions / lens effects
    • Motion blur, depth of field
    • …
· Options
  – More samples for anti-aliasing
  – Adaptive sampling
  – Combine with IBR (image-based rendering)
    • E.g. "RenderCache": reuse samples by reprojection
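As an illustration of ray generation and of the "more samples for anti-aliasing" option, the following sketch computes primary ray directions for a pinhole camera and a regular n x n pattern of sub-pixel samples. The `Camera` layout, the function names and the choice of a camera looking down -z are assumptions made for this example; other camera models would only change `primaryRayDir`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Dir { float x, y, z; };

struct Camera {
    int width, height;     // image resolution in pixels
    float fovYDegrees;     // vertical field of view
};

// Unit direction of the ray through the continuous image position (px, py),
// where (0, 0) is the top-left corner and the camera looks down -z.
Dir primaryRayDir(const Camera& cam, float px, float py) {
    const float pi = 3.14159265358979f;
    float tanHalf = std::tan(0.5f * cam.fovYDegrees * pi / 180.0f);
    float aspect = float(cam.width) / float(cam.height);
    float x = (2.0f * px / cam.width - 1.0f) * tanHalf * aspect;
    float y = (1.0f - 2.0f * py / cam.height) * tanHalf;
    float invLen = 1.0f / std::sqrt(x * x + y * y + 1.0f);
    return {x * invLen, y * invLen, -invLen};
}

// n x n regularly spaced sample directions inside pixel (ix, iy).
// Averaging the results of these rays gives a simple anti-aliased pixel value.
std::vector<Dir> pixelSampleDirs(const Camera& cam, int ix, int iy, int n) {
    std::vector<Dir> dirs;
    dirs.reserve(static_cast<std::size_t>(n) * n);
    for (int sy = 0; sy < n; ++sy)
        for (int sx = 0; sx < n; ++sx)
            dirs.push_back(primaryRayDir(cam, ix + (sx + 0.5f) / n, iy + (sy + 0.5f) / n));
    return dirs;
}
```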

Ray-Traversal

· Need to find objects quickly
  – "Exhaustive" search is infeasible
· Build spatial index structure
  – Grid, octree, BSP-tree, BVH, ...
· Advantages
  – Logarithmic complexity
  – Occlusion culling
    • "Early ray termination"
· Problems
  – Multiple intersection computations (objects often lie in multiple voxels)
  – Dynamic scenes?

[Figures: Grid (2D), Octree (2D)]
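To make the 2D grid case concrete, here is a sketch of front-to-back grid traversal using incremental stepping (a DDA in the style of Amanatides and Woo, chosen for this example; the slides do not say which traversal scheme the authors use). The function name, the unit cell size and the requirement that the ray starts inside the grid are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

// Cells of an nx x ny grid of unit-sized cells visited by a 2D ray, in
// front-to-back order. The origin (ox, oy) is assumed to lie inside the grid;
// the direction (dx, dy) need not be normalized.
std::vector<std::pair<int, int>> gridCellsAlongRay(float ox, float oy, float dx, float dy,
                                                   int nx, int ny) {
    const float inf = std::numeric_limits<float>::infinity();
    int cx = int(std::floor(ox)), cy = int(std::floor(oy));
    int stepX = dx > 0 ? 1 : -1, stepY = dy > 0 ? 1 : -1;

    // Ray parameter at which the ray crosses the next cell boundary in x and in y.
    float tMaxX = dx != 0 ? ((dx > 0 ? cx + 1 : cx) - ox) / dx : inf;
    float tMaxY = dy != 0 ? ((dy > 0 ? cy + 1 : cy) - oy) / dy : inf;
    // Parameter increment needed to cross one whole cell in x and in y.
    float tDeltaX = dx != 0 ? std::fabs(1.0f / dx) : inf;
    float tDeltaY = dy != 0 ? std::fabs(1.0f / dy) : inf;

    std::vector<std::pair<int, int>> cells;
    while (cx >= 0 && cx < nx && cy >= 0 && cy < ny) {
        // A ray tracer would intersect the objects stored in this cell here and
        // stop at the first hit inside the cell ("early ray termination").
        cells.push_back({cx, cy});
        if (tMaxX < tMaxY) { tMaxX += tDeltaX; cx += stepX; }
        else               { tMaxY += tDeltaY; cy += stepY; }
    }
    return cells;
}
```

Because cells are visited in order of increasing distance, geometry behind the first hit is never touched, which is the occlusion-culling effect listed under "Advantages".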

Ray-Object Intersection

· Need to compute intersections fast
  – Requires many floating point operations
    • But cost is typically dominated by traversal (2:1)
  – Plenty of algorithms
    • Plenty of primitives
    • Even for triangles
· Optimizations
  – Use SIMD CPU extensions (SSE, AltiVec, 3DNow!)
    • Data-parallel execution
  – Proper caching of data
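As one example of the many ray-triangle algorithms, the sketch below uses the widely known Möller-Trumbore test. It is chosen here for brevity and is not necessarily the algorithm used by the authors; the type `V3` and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct V3 { float x, y, z; };
static V3 sub(V3 a, V3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot3(V3 a, V3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static V3 cross(V3 a, V3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Returns true and writes the hit distance to t if the ray orig + t * dir
// (t > 0) hits the triangle (a, b, c). u and v are barycentric coordinates.
bool intersectTriangle(V3 orig, V3 dir, V3 a, V3 b, V3 c, float& t) {
    const float eps = 1e-7f;
    V3 e1 = sub(b, a), e2 = sub(c, a);
    V3 p = cross(dir, e2);
    float det = dot3(e1, p);
    if (std::fabs(det) < eps) return false;      // ray parallel to the triangle plane
    float invDet = 1.0f / det;
    V3 s = sub(orig, a);
    float u = dot3(s, p) * invDet;
    if (u < 0.0f || u > 1.0f) return false;
    V3 q = cross(s, e1);
    float v = dot3(dir, q) * invDet;
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot3(e2, q) * invDet;
    return t > eps;                              // reject hits behind the origin
}
```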

Shading

· Lots of reflection models possible
  – Phong, Cook-Torrance, Ward, …
  – Direct use of shading languages (RenderMan)
· Shading after visibility has been computed
  – No overhead due to overdraw
  – Every ray is shaded exactly once
· Can generate new rays
  – Shadow, reflection, transmission, ...
    • Need to deal with recursion
    • Rendering cost is linear in #rays traced

Introduction to Ray-Tracing

· Only three main components:
  – Generating rays
  – Finding the closest intersection of a ray
    • Ray traversal
    • Ray-object intersection
  – Shading
· Problem:
  – "Find closest intersection" is very expensive
  – And: lots of rays per image…

In Contrast: Rasterization

· Efficient HW implementation
  – Use of object coherence
  – Many new features
· Rendering is driven by the application
  – Application submits geometry
· Visibility determined at the end
  – Z-buffer fragment test

Rasterization pipeline: Application → T&L, Vertex Ops → Rasterization → Texturing → Fragment Ops → Fragment Tests → Framebuffer

Rasterization: Drawbacks of this approach

· Use of object coherence
  – Only if the triangle is large
· Rendering is driven by the application
  – Application has to know what is visible…
  – Efficient occlusion culling is hard
· Visibility determined at the end
  – Overdraw: all but one fragment per pixel are discarded
  – High depth complexity: very inefficient

Ray-Tracing versus Rasterization

· Flexibility
  – Handling unstructured groups of rays
    • Image-based rendering, reflections, shadows …
· Generality
  – Ray-tracing is the basis for many algorithms
    • Global illumination, visibility, …
  – Used in many disciplines
    • Physics, biology, chemistry, telecom, …

Ray-Tracing versus Rasterization

· Simple and efficient shading
  – Shading happens after visibility computation
  – Direct use of shading languages
· Correctness & image quality
  – Rasterization inherently relies on approximations
    • Environment maps, shadow maps, ...
  – Ray-traced images are "correct" by default
    • "True" reflections and shadows…
    • Use of approximations is optional

Ray-Tracing versus Rasterization

· Parallel scalability
  – Ray-tracing is "embarrassingly parallel" (e.g. each pixel is independent of all others)
  – Scales well with the available hardware
  – Needs fast access to the scene database

Ray-Tracing versus Rasterization

· Scalability with scene size: occlusion culling & logarithmic complexity
  – RT never even looks at invisible geometry
  – RT traversal allows for efficient searching: O(log N)
  – Rasterization shows linear behavior: O(N)
· RT wins for complex scenes
  – But rasterization is improving

Ray-Tracing versus Rasterization

· Coherence
  – Key to efficient rendering
  – Rasterization: object coherence
    • Allows for efficient HW implementation
    • But only really efficient for large triangles
  – Ray-tracing: ray coherence
    • Improved caching & reduced bandwidth
    • Allows for data-parallel computation
  – RT has much more coherence than assumed
    • But harder to exploit…

Ray-Tracing versus Rasterization

· Conclusion of that comparison
  – Ray-tracing has many advantages
    • These advantages become ever more pronounced
    • Not only quality, also efficiency…
  – But: ray-tracing is (still) costly
    • Have to make it faster!


Previous and Related Work

Two ways to achieve ray-tracing-like quality interactively:

· Trace fewer rays per frame: "Approximate ray-tracing"
  – Rasterization hardware
  – Image-based techniques
  – Interpolation of ray-traced results
· Trace more rays/sec: "Accelerated ray-tracing"
  – Better data structures
  – Better algorithms
  – Better implementations
  – Parallel processing


Approximated Ray-Tracing: Rasterization Hardware

· "HW-accelerated" vista/shadow buffers
  – Compute visible geometry in HW
    • Lookup of geometry in frame buffer
  – Only works for primary rays and point lights
  – Creates artifacts (e.g. shadow buffer resolution)
· Augmenting hardware with RT effects
  – Selective ray-tracing
  – Integrate ray-tracing with OpenGL rendering
    • Rasterization for diffuse objects
    • Textures or splatting [Stamminger/Haber 00/01] for ray-traced samples

[Figure: Approximated Ray-Tracing: Corrective Textures]

Approximated Ray-Tracing: Image-Based Techniques

· RenderCache [Walter et al. 99]
  – Store ray samples per pixel (color, depth, ...)
  – Reproject samples for the next frame
  – Detect and fill holes by sending a few new rays
    • Heuristic algorithms based on neighborhood
  – Locate and correct errors (shadows, etc.)
    • Pseudo-randomly sample a few other pixels
    • Adaptively sample near error regions
  – But: reprojection and heuristics are expensive
    • Pays off (only) when pixels are very expensive to compute directly (e.g. global illumination)
  – Scales badly with #CPUs

Approximated Ray-Tracing: Image-Based Techniques

· Holodeck [Ward 98]
  – Similar to RenderCache, but
    • Long-term storage of ray samples on disk
    • Fast access to samples based on grid structure
  – Builds light-field-like data representation

Approximated Ray-Tracing: Image-Based Techniques

· Interpolation in the image plane
  – Pixel-selected ray-tracing [Akimoto, 89]
    • Coarse sampling grid
    • Adaptive refinement based on error criteria
    • Linear interpolation between samples
· General ray interpolation [Bala, 99]
  – Object-/Ray-/Image-Space
  – Time
  – Error bounded
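The idea behind interpolation in the image plane (coarse samples, refinement where an error criterion fails, linear interpolation elsewhere) can be shown on a single scanline. This is a deliberately simplified one-dimensional sketch, not a reproduction of the cited method: the function name, the `tracePixel` callback and the "difference of neighboring coarse samples" criterion are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Renders one scanline of `width` pixels. Every `step`-th pixel is traced;
// between two coarse samples the pixels are linearly interpolated if the
// samples differ by at most `threshold`, and traced individually otherwise.
// width must be of the form k * step + 1 so that the last pixel is a sample.
std::vector<float> renderScanline(int width, int step, float threshold,
                                  const std::function<float(int)>& tracePixel) {
    assert(step > 0 && width >= 1 && (width - 1) % step == 0);
    std::vector<float> row(static_cast<std::size_t>(width));

    for (int x = 0; x < width; x += step)          // coarse sampling grid
        row[x] = tracePixel(x);

    for (int x0 = 0; x0 + step < width; x0 += step) {
        int x1 = x0 + step;
        bool smooth = std::fabs(row[x1] - row[x0]) <= threshold;   // error criterion
        for (int x = x0 + 1; x < x1; ++x) {
            if (smooth) {
                float w = float(x - x0) / float(step);
                row[x] = (1.0f - w) * row[x0] + w * row[x1];       // linear interpolation
            } else {
                row[x] = tracePixel(x);                            // adaptive refinement
            }
        }
    }
    return row;
}
```

With a step of 4, a smooth scanline needs roughly a quarter of the rays; the savings shrink wherever the criterion triggers refinement, and features that fall entirely between two similar samples are missed.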


Accelerated Ray-Tracing: Better Data Structures/Algorithms

· "Best" data structure (grid vs. BSP vs. …)?
  – Always scene and implementation dependent
  – In practice, most do about equally well…
  – Well-researched topic: "new" data structures are unlikely to be found
· But: potential for better algorithms:
  – Can we better exploit coherence?
  – Can we build data structures faster?
  – Can we build data structures fully automatically?
· Also: need for dynamic data structures

Accelerated Ray-Tracing: Parallelization on Supercomputers

· RT of large CSG models [Muuss 95]
  – Motivation: interactively render complex data sets
  – Idea: use ray-tracing
    • Flexibility: avoid tessellation of CSG models
    • Take advantage of logarithmic complexity of RT
    • Exploit parallelism
  – Implementation
    • Optimized, general RT algorithm
    • 96 CPUs, SGI PowerChallenge, shared memory
  – Results
    • 1-2 frames per second at video resolution (in '95!)

Accelerated Ray-Tracing: Parallelization on Supercomputers

· Utah Parallel RT System [Parker 99]
  – Similar approach to Muuss
    • Parallelization on shared-memory machine
  – Supports general primitives and volume data sets
  – Results
    • Has shown scalability up to 128 CPUs
    • Importance of caching analysis
    • New goal: interactive visual cues for visualization (same information at less cost)


IRT on PCs: What to keep in mind

· PC hardware has changed dramatically
  – Processors have become much faster
    • But the increase in ray-tracing speed is gradual
  – Increasing gap between the speed of CPU and memory
    • But the ray-tracing algorithm did not change
  – SIMD extensions
    • Flops become increasingly cheap
    • But difficult to take advantage of in ray-tracing
  – Fast (and cheap) networking & networks of PCs
    • But good performance on non-shared-memory systems is hard
    • Small clusters are around everywhere…

IRT on PCs: What to keep in mind

· PC hardware has changed dramatically: we have to adapt our algorithms!
  – Special emphasis on
    • Keeping the CPU busy
    • Memory & caching (1 cache miss can cost several triangle intersections)
    • SIMD
  – Not so important any more:
    • Instruction count, avoiding float ops

General Optimizations: Cache

Main memory is too slow for the CPU (1:10), in both bandwidth and latency

· Keep relevant data in caches
  – Design algorithms for cache reuse (coherence)
  – Align data to cache lines (32 bytes)
  – Separate data according to usage
    • Separate volatile from non-volatile data
    • Store intersection data separate from shading data (e.g. shading normals are not needed for intersection)
  – Prefetch data
    • Design algorithms to enable data access prediction

General Optimizations: Cache

Cache Reuse Example: Triangle Data Structure

· Variant 1: struct Triangle { Vec3f *a, *b, *c; };
  – Intersect() routine works on this structure
  – Prefetching is hard (2 levels of indirection)
  – Data stored in 4 different memory regions (1 struct + 3 vectors)
  – Worst case: 8 cache misses (if each of the 4 data items straddles a cache-line border)
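The worst-case count can be checked with a little address arithmetic. The sketch below assumes the 32-byte cache lines mentioned earlier and 12-byte objects (three 32-bit pointers for the struct, three floats per vertex); `linesTouched` and the example addresses are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uintptr_t kCacheLine = 32;   // cache line size in bytes (from the slides)

// Number of cache lines touched by `size` bytes starting at address `addr`.
constexpr std::size_t linesTouched(std::uintptr_t addr, std::size_t size) {
    return size == 0 ? 0 : (addr + size - 1) / kCacheLine - addr / kCacheLine + 1;
}

// A 12-byte object that starts at the beginning of a line needs one line ...
static_assert(linesTouched(0, 12) == 1, "fits in one cache line");
// ... but one that starts 24 bytes into a line straddles the border.
static_assert(linesTouched(24, 12) == 2, "straddles a cache-line border");
// Variant 1: 1 struct + 3 vertices in 4 unrelated regions, all straddling.
static_assert(4 * linesTouched(24, 12) == 8, "worst case: 8 cache misses");
```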

General Optimizations: Cache

Cache Reuse Example: Triangle Data Structure

· Variant 2: with preprocessed intersection data
  – All necessary data packed into 48 aligned bytes (see paper)
  – Con: additional data to store (48 bytes/triangle)
  – But several advantages:
    • At most 2 cache misses
    • 1 contiguous memory region: trivial to prefetch
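The slide defers the actual 48-byte layout to the paper, so the following is only a hypothetical layout that happens to fit the same budget: one vertex, two edge vectors and the geometric normal, twelve floats in total. The names `PackedTriangle` and `packTriangle` and the choice of fields are assumptions for illustration; shading data such as vertex normals would live in a separate array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical preprocessed intersection record: 12 floats = 48 bytes,
// stored by value so that one triangle occupies one contiguous region.
struct alignas(16) PackedTriangle {
    float a[3];    // first vertex
    float e1[3];   // edge b - a
    float e2[3];   // edge c - a
    float n[3];    // unnormalized geometric normal e1 x e2
};
static_assert(sizeof(PackedTriangle) == 48, "record must stay at 48 bytes");
static_assert(alignof(PackedTriangle) == 16, "record must be 16-byte aligned");

// Precomputes the record once, at scene build time.
PackedTriangle packTriangle(const float a[3], const float b[3], const float c[3]) {
    PackedTriangle t;
    for (int i = 0; i < 3; ++i) {
        t.a[i] = a[i];
        t.e1[i] = b[i] - a[i];
        t.e2[i] = c[i] - a[i];
    }
    t.n[0] = t.e1[1] * t.e2[2] - t.e1[2] * t.e2[1];
    t.n[1] = t.e1[2] * t.e2[0] - t.e1[0] * t.e2[2];
    t.n[2] = t.e1[0] * t.e2[1] - t.e1[1] * t.e2[0];
    return t;
}
```

Stored in a `std::vector<PackedTriangle>`, consecutive triangles are 48 bytes apart, so the address of the next record is known without following any pointer. A 16-byte-aligned, 48-byte record spans at most two 32-byte cache lines, which matches the "at most 2 cache misses" bound.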

General Optimizations: Cache

· This was only one example. Similarly for
  – BSP nodes (even more important)
  – Triangle lists
  – Materials
  – Shading data
  – …

General Optimizations: Simplification

Today's CPUs have very long pipelines

· Simplify the code to avoid pipeline stalls
  – Choose simple algorithms
    • "KISS" wins… (KISS = keep it simple and stupid)
    • E.g. BSP-tree traversal is simpler than grid traversal
    • Easier to maintain and optimize (e.g. prefetching)
  – Write tight inner loops
    • E.g. better caching and handling of branches
  – Avoid conditionals/relative jumps in inner loops
    • E.g. support only triangles
  – Avoid memory-access stalls
    • Caching, caching, caching!

Optimization: SIMD Extensions

Most CPUs provide SIMD extensions. Intel: SSE (others: 3DNow!, AltiVec, ...)

· Use SIMD: higher speed & lower bandwidth
  – Up to four parallel floating point operations for the cost of one!
  – Fetch data once to reduce bandwidth to cache
    • Amortize loading cost over 4 operations: factor 4 in bandwidth reduction
  – Overhead due to restricted instruction set
    • E.g. no "SSE dot product"
  – Con: programming in assembly language

Page 45:

Optimization: SIMD Extensions

How to use SIMD extensions?
· Either: Instruction-parallel
  – Combine 4 computations in 'normal' algorithm
  – E.g. the 4 mults in a dot product
· Or: Data-parallel
  – Run algorithm on 4 different data in parallel
  – E.g. 4 independent dot products
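The two options can be contrasted in a few lines. This is only an illustrative sketch: plain Python lists stand in for 4-wide SSE registers, and the function names are made up for this example. The point is that the instruction-parallel form needs a horizontal sum across the register, while the data-parallel form (one component of 4 different vectors per register) uses only element-wise operations.

```python
# Illustrative sketch: Python lists of length 4 stand in for 4-wide SIMD registers.
# Function names are hypothetical and not taken from the tutorial.

def dot_instruction_parallel(a, b):
    """ONE dot product of two 4-vectors.

    The 4 multiplies map to one 4-wide multiply, but the final sum runs
    across the register (horizontal add), which SSE has no single
    instruction for.
    """
    products = [x * y for x, y in zip(a, b)]  # one 4-wide multiply
    return products[0] + products[1] + products[2] + products[3]  # horizontal sum


def dots_data_parallel(ax, ay, az, bx, by, bz):
    """FOUR independent 3D dot products in structure-of-arrays layout.

    Each argument holds one component of 4 different vectors, so every
    step is an element-wise 4-wide multiply or add.
    """
    return [ax[i] * bx[i] + ay[i] * by[i] + az[i] * bz[i] for i in range(4)]
```

The structure-of-arrays layout is what makes the data-parallel variant cheap: no shuffling is needed between the multiply and the add.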

Page 46:

SIMD: Intersection

· SIMD best used in data parallel fashion
  – Little instruction-level parallelism (in RT)
    • Just doesn't work…
  – Data parallel: 1 ray, 4 triangles
    • Hard to always have four triangles ready
    • Data parallel traversal for 1 ray?
  – Data parallel: 4 rays, 1 triangle
    • Must traverse rays in parallel (ray packets)
    • Standard intersection code
    • Overhead for terminated rays (e.g. 1 ray hits, 3 rays miss)
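A minimal sketch of the "4 rays, 1 triangle" variant follows. It assumes the Möller-Trumbore test as the per-ray intersection routine (the slides do not say which test is used), and the loop over lanes stands in for instructions that SSE would execute on all 4 rays at once. The `active` mask and all function names are assumptions made for this example.

```python
# Sketch only: a 4-ray packet tested against one triangle.
# Möller-Trumbore is an assumed stand-in for the intersection test.

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect_packet(origins, directions, active, tri, eps=1e-9):
    """Return (hit_mask, distances) for 4 rays against one triangle."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)  # triangle data is fetched once for all 4 rays
    hit = [False] * 4
    dist = [float("inf")] * 4
    for lane in range(4):  # under SSE: one instruction stream for all 4 lanes
        if not active[lane]:
            continue  # terminated ray: this lane does no useful work
        o, d = origins[lane], directions[lane]
        p = cross(d, e2)
        det = dot(e1, p)
        if abs(det) < eps:
            continue  # ray parallel to the triangle plane
        inv = 1.0 / det
        s = sub(o, v0)
        u = dot(s, p) * inv
        if u < 0.0 or u > 1.0:
            continue
        q = cross(s, e1)
        v = dot(d, q) * inv
        if v < 0.0 or u + v > 1.0:
            continue
        t = dot(e2, q) * inv
        if t > eps:
            hit[lane] = True
            dist[lane] = t
    return hit, dist
```

In real SIMD code the early `continue` statements become mask updates, so a packet pays the full cost even when only one of its rays is still active.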

Page 47:

SIMD: Intersection

· Performance results
  – Comparison against already optimized C code
  – Amortized cost for SSE code

               C      SSE    Speedup
Min. cycles    78     22     3.5
Max. cycles    148    40     3.7

20-36 million intersections/sec! (P-III, 800 MHz)
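The throughput figure follows from the cycle counts, assuming the SSE numbers are amortized cycles per ray-triangle intersection and that one intersection is issued back to back after another. A small sketch of the arithmetic (the helper names are made up for this example):

```python
# Worked arithmetic for the table above; helper names are hypothetical.

CLOCK_HZ = 800e6  # P-III at 800 MHz, as stated on the slide

def intersections_per_second(cycles_per_intersection, clock_hz=CLOCK_HZ):
    """Throughput if every intersection costs the given number of cycles."""
    return clock_hz / cycles_per_intersection

def speedup(c_cycles, sse_cycles):
    """Ratio of C cycles to amortized SSE cycles."""
    return c_cycles / sse_cycles
```

With 22 and 40 amortized cycles this gives roughly 36 and 20 million intersections per second, and speedups of about 3.5 and 3.7.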

Page 48:

SIMD: BSP-Traversal

· Recursive traversal algorithm

Page 49:

SIMD: BSP-Traversal

· SIMD traversal
  – Traverse four rays in parallel
    • Intersection with split plane & traversal decision
  – Combine decision flags
    • All rays must perform the same traversal
    • Make sure order is consistent
      – Easy to guarantee: same ray origin or same signs of direction vector
  – Avoid recursive function calls
    • Maintain stack manually
  – Worst case: as bad as before…
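The steps above can be sketched as follows. This is a simplified illustration, not the tutorial's implementation: the tree is a nested dict with hypothetical keys (`axis`, `split`, `left`, `right`, `leaf`), leaves only report their name instead of intersecting triangles, and all rays are assumed to have nonzero direction components with identical signs, so one near/far ordering is valid for the whole packet.

```python
# Sketch of packet traversal through an axis-aligned BSP tree with a manual stack.
# Assumptions: nested-dict tree layout (hypothetical), all direction components
# nonzero, and all rays in the packet share the same direction signs.

def traverse_packet(root, origins, dirs, t_near, t_far):
    """Return the leaf names visited by the packet, in traversal order."""
    visited = []
    stack = [(root, list(t_near), list(t_far))]  # manual stack instead of recursion
    while stack:
        node, near, far = stack.pop()
        while node is not None and "leaf" not in node:
            axis, split = node["axis"], node["split"]
            # Same direction signs for all rays: ray 0 decides the order for the packet.
            if dirs[0][axis] > 0:
                first, second = node["left"], node["right"]
            else:
                first, second = node["right"], node["left"]
            # Per-ray distance to the split plane (one 4-wide operation under SIMD).
            t = [(split - o[axis]) / d[axis] for o, d in zip(origins, dirs)]
            active = [n <= f for n, f in zip(near, far)]
            # Combine the per-ray decisions: a child is visited if ANY active ray needs it.
            go_first = any(a and ti >= n for a, ti, n in zip(active, t, near))
            go_second = any(a and ti <= f for a, ti, f in zip(active, t, far))
            if go_first and go_second:
                # Far child is deferred; rays that miss it get an empty interval.
                stack.append((second, [max(n, ti) for n, ti in zip(near, t)], far))
                far = [min(f, ti) for f, ti in zip(far, t)]
                node = first
            elif go_first:
                node = first
            elif go_second:
                node = second
            else:
                node = None  # no active ray left in this subtree
        if node is not None and any(n <= f for n, f in zip(near, far)):
            visited.append(node["leaf"])
    return visited
```

A coherent packet visits exactly the leaves a single ray would. If one ray diverges, the whole packet is dragged into the extra subtree with the other rays masked out, which is the worst case mentioned on the slide.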

Page 50:

SIMD: BSP-Traversal

· Overhead of SIMD traversal (in %)
  – Fixed resolution at 1024² (left), fixed 2x2 packet (right)
  – Traversal still dominates rendering cost
  – Overall speedup factor: 2 to 2.3

               Packet size (at 1024²)    Resolution (2x2 packets)
               2x2     4x4     8x8       256²     1024²
Shirley 6      1.4     4.4     11.8      5.8      1.4
MGF office     2.6     8.2     21.6      10.4     2.6
MGF conf.      3.2     10.6    28.2      12.2     3.2

Page 51:

Coherent Algorithm: Tracing Ray Packets

Many rays are very similar
e.g. primary and shadow rays, but others too
· Handle rays together in packets of 4 rays
  – Process them in lock-step (SIMD)
  – Reorder computations to be partly breadth-first
  – Load data once and use it for all rays
    • Reduces memory bandwidth (e.g. SSE: factor 4!)
    • Increases cache utilization
  – Coherence increases with image resolution
    • "more rays in same view frustum"

Page 52:

Coherent Algorithms: Shading

· SIMD Phong shading
  – Fixed cost per image
  – Rearrange data from ray packets
    • Different depth: non-coherent shadow rays
    • Different materials: different shaders
  – Algorithm
    • Parallel shadow rays to light sources
    • SIMD shading using shadow flags
  – Constant shading & texturing cost (<10%)
  – Procedural shading is easy (noise)
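"SIMD shading using shadow flags" can be illustrated with a reduced example. The sketch below computes only an ambient plus diffuse term (not full Phong) for 4 hit points and one light, and it assumes unit-length normals and light directions. The shadow flag is turned into a 0/1 factor, so all 4 lanes run the same arithmetic without branching. All names and the ambient constant are assumptions of this example.

```python
# Sketch: ambient + diffuse shading for a packet of 4 hit points and one light.
# Assumes unit-length normals and light directions; names are hypothetical.

def shade_packet(normals, light_dirs, in_shadow, albedo, ambient=0.1):
    """Return one intensity per lane; shadowed lanes receive only ambient light."""
    out = []
    for n, l, shadowed in zip(normals, light_dirs, in_shadow):
        n_dot_l = max(0.0, n[0] * l[0] + n[1] * l[1] + n[2] * l[2])
        lit = 0.0 if shadowed else 1.0  # shadow flag used as a multiplicative mask
        out.append(albedo * (ambient + lit * n_dot_l))
    return out
```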

Page 53:

Coherent Ray-Tracing: Summary

· Speedup
  – Prerequisite: Expose coherence in ray-tracing algorithm
  – Factor >5: General optimizations
  – Factor >2: SIMD computations
  – Further optimizations are possible
    • Better prefetching, more efficient shading
· Performance
  – 200K to 1.5M primary rays/s (800 MHz, P-III)
  – Almost linear in # of reflection & shadow rays

Page 54:

Comparison: Test Scenes

Page 55:

Comparison: Software Ray-Tracers

· Time per primary ray (1 CPU, 512², in µs)
  – Main memory: RTRT 256MB, others up to 1GB
  – Rayshade: Best grid resolution

               Tris    Rayshade   POV     RTRT   Factor   fps
MGF office     40K     29.0       22.9    2.1    10.9     1.8
MGF conf       256K    36.1       29.6    2.3    12.8     1.6
MGF theater    680K    56.0       57.2    3.6    15.5     1.1
Library        907K    72.1       50.5    3.4    14.8     1.1
Soda, Floor5   2.5M    OOM        OOM     2.9    -        1.5
Soda Hall      8M      OOM        OOM     4.5    -        0.8
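The columns of this table are related by simple arithmetic. The sketch below assumes one primary ray per pixel at 512x512 and reads "Factor" as the faster of the two other ray-tracers divided by RTRT. Both are interpretations that fit the MGF office row, not statements from the slides, and the helper names are made up.

```python
# Sketch of how the table columns relate; interpretation and names are assumptions.

def fps_from_ray_time(us_per_ray, width=512, height=512):
    """Frames per second if every pixel costs one primary ray of the given time."""
    return 1.0 / (us_per_ray * 1e-6 * width * height)

def factor(rtrt_us, *other_us):
    """Speedup of RTRT over the fastest of the other ray-tracers."""
    return min(other_us) / rtrt_us
```

For MGF office, 2.1 µs per ray gives about 1.8 fps, and 22.9 / 2.1 gives a factor of about 10.9.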

Page 56:

Comparison: OpenGL Hardware

· Frame rate with SGI-Performer (512², fps)
  – HW: Octane V8, Onyx3/IR3, Geforce II GTS
  – CPUs: Onyx: 8, nVidia: 2, RTRT: 1

               Tris    Octane   Onyx    nVidia   RTRT
MGF office     40K     >24      >36     12.7     1.8
MGF conf       256K    >5       >10     5.4      1.6
MGF theater    680K    0.4      6-12    1.5      1.1
Library        907K    1.5      4       1.6      1.1
Soda, Floor5   2.5M    0.5      1.5     0.6      1.5
Soda Hall      8M      OOM      OOM     OOM      0.8

Page 57:

Comparison: Scaling with Scene Size

· Render time of subsampled terrain (spf)
  – Typical linear scaling of rasterization HW
  – Worst case for RT: No occlusion
  – Only 1 CPU!

Page 58:

Demo / Video

Page 59:

Distributed RT of Massive Models

Page 60:

Reference Model (12.5 Mtris)

Page 61:

Previous Work

· Rendering of Massive Models [Aliaga 99]
  – Framerate: 5 to 15 fps for single power plant
    • Needs shared-memory supercomputer (SGI)
  – Framework of algorithms
    • Textured-depth-meshes (96% reduction in #tris)
    • View-frustum culling & LOD (50% each)
    • Hierarchical occlusion maps (10%)
  – Extensive preprocessing required
    • Entire model: ~3 weeks (estimated)
    • Only semi-automatic

Page 62:

Distributed RT of Massive Models

· Ray-tracing and massive models just match:
  – Logarithmic scaling in #primitives
    • Ideal for big models
  – Preprocessing
    • Simple and fast spatial sorting, fully automatic
  – Distributed computing
    • Parallel scalability to many networked computers
    • No scene replication

Our approach: Use coherent ray-tracing
  – Caching of scene data in network
  – Deal with network issues by reordering

Page 63:

Ray-Tracing Issues

· Distributed scene management
  – Several GB of scene data
    • File size and virtual address space (32 bit)
  – Cannot use OS caching (demand paging)
    • Cache miss will stall the entire process
      – 1 ms network latency = time to trace several hundred rays
    • Reordering would need non-blocking memory read

Need to handle cache manually
  • No longer limited by address space
  • Allows reordering of computations
    • Do not wait for missing data
    • Continue with other rays while data is being fetched…
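At the 200K to 1.5M rays/s quoted earlier, 1 ms of latency corresponds to roughly 200 to 1500 rays, which is why stalling on a miss is so costly. A minimal sketch of a manually managed cache with reordering follows. It is an illustration under stated assumptions, not the tutorial's design: `VoxelCache`, `try_get`, `service_one` and `trace_all` are hypothetical names, each ray needs exactly one voxel, and the asynchronous loader is simulated by completing one pending request whenever no ray is runnable.

```python
# Sketch: manual cache with non-blocking lookups and ray reordering.
# All class and function names are hypothetical; the network is simulated.
from collections import deque

class VoxelCache:
    def __init__(self, fetch):
        self._fetch = fetch      # assumed blocking loader, e.g. a read from the model server
        self._data = {}
        self._pending = deque()  # requests issued but not yet completed

    def try_get(self, voxel_id):
        """Non-blocking: (True, data) on a hit, else queue a request and return (False, None)."""
        if voxel_id in self._data:
            return True, self._data[voxel_id]
        if voxel_id not in self._pending:
            self._pending.append(voxel_id)
        return False, None

    def service_one(self):
        """Stand-in for the asynchronous loader: complete one pending request."""
        if self._pending:
            voxel_id = self._pending.popleft()
            self._data[voxel_id] = self._fetch(voxel_id)


def trace_all(rays, cache):
    """rays: list of (ray_id, voxel_id). Returns ray ids in completion order."""
    ready, suspended, done = deque(rays), deque(), []
    while ready or suspended:
        if ready:
            ray_id, voxel_id = ready.popleft()
            hit, _voxel = cache.try_get(voxel_id)
            if hit:
                done.append(ray_id)  # a real tracer would intersect the voxel here
            else:
                suspended.append((ray_id, voxel_id))  # do not wait, go on with other rays
        else:
            cache.service_one()  # nothing runnable: let one fetch arrive
            ready, suspended = suspended, deque()
    return done
```

In this sketch rays complete out of order: a ray whose voxel arrives early overtakes one that is still waiting, and each voxel is requested from the server only once.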

Page 64:

Massive Models: Caching

· 2-level BSP-trees
  – Caching based on "voxels"
  – Voxels are completely self-contained

Page 65:

Structure of the BSP-Tree

Page 66:

Distribution Issues

· Preprocessing
  – Simple spatial sorting
  – Need out-of-core algorithm due to model size
  – Simplistic implementation: 2.5 hours
    • Estimated with optimizations: < 30 min
· Model server
  – Single server provides all model data
    • Potential bottleneck
  – Should be distributed as well
    • At least for more than 10 clients
    • Trivial to implement

Page 67:

Distribution Issues

· Load balancing
  – Tile based (32x32 pixels)
  – Demand driven
  – Avoid idle times
    • 'Prefetching' tiles
    • Asynchronous communication
    • …
· Frame-to-frame coherence
  – Keep rays on the same client
    • Simple: Keep tiles on the same client
    • Better: Assign tiles based on reprojected pixels
  – Larger effective cache size
    • Increases with number of clients
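The "simple" variant (demand-driven tiles that stay on the same client from frame to frame) can be sketched as below. The class and method names are hypothetical, clients are identified by arbitrary labels, and tile prefetching and asynchronous communication are left out.

```python
# Sketch: demand-driven tile scheduling with frame-to-frame client affinity.
# Names are hypothetical; networking and prefetching are omitted.

def make_tiles(width, height, tile=32):
    """Top-left corners of all tiles covering a width x height image."""
    return [(x, y) for y in range(0, height, tile) for x in range(0, width, tile)]

class TileScheduler:
    def __init__(self, tiles):
        self._tiles = list(tiles)
        self._todo = []
        self._last_owner = {}  # tile -> client that rendered it most recently

    def start_frame(self):
        self._todo = list(self._tiles)

    def next_tile(self, client):
        """Called by a client whenever it runs out of work (demand driven)."""
        if not self._todo:
            return None
        for i, t in enumerate(self._todo):
            if self._last_owner.get(t) == client:
                break  # prefer a tile whose scene data this client likely has cached
        else:
            i = 0  # nothing of its own left: take any remaining tile
        tile = self._todo.pop(i)
        self._last_owner[tile] = client
        return tile
```

Because a client first asks for tiles it rendered in the previous frame, its cache stays useful, while the fallback keeps fast clients from idling once their own tiles are done.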

Page 68:

Results

· Setup
  – Seven dual Pentium-III 800-866 MHz
  – FastEthernet (100Mbit) for normal clients
  – GigabitEthernet only for display & model server
· Performance for one power plant
  – 4-5 fps without SSE optimization
  – Factor 2 speedup with SSE
  – Almost perfect scaling from 1 to 14 CPUs
    • Never tried any more than that

Page 69:

Animation: Framerate vs. Bandwidth

Page 70:

Speedup

Page 71:

Demo / Video

Page 72:

Tutorial Overview

· Introduction
  – Introduction to Ray-Tracing
  – Discussion: Ray-Tracing versus Rasterization
· Previous Work
  – Approximating Ray-Tracing
  – Accelerated Ray-Tracing
· Interactive Ray-Tracing on PCs
  – Coherent Ray-Tracing Implementation
  – Comparisons (SW / HW)
  – Distributed RT of Massive Models
· Outlook: Hardware-Architectures for Ray-Tracing
· Future Research and Conclusions

Page 73:

Ray-Tracing Hardware

· Summary so far:
  – RT has many technical advantages
  – Better performance for large scenes (log N vs. N)
  – Better image quality, more features
  – But: High initial cost on main CPU

Hardware support would help

Page 74:

Ray-Tracing Hardware: Why today?

The setting has changed
  – 'Real' scenes aren't suited for rasterization any more
    • High depth complexity
    • Large scenes, small triangles
    • Shading becomes more expensive
    • Demand for more features (shading, programmability)

Advantages of ray-tracing finally come into play
  – Also: Flops aren't that expensive any more
    • Number of Gigaflops per GeForce?
  – Neither is memory…

Page 75:

Ray-Tracing Hardware: Previous Work

· Over the last decade: Several research systems
  – Often suffered from lack of resources
    • Memory and Flops too expensive 10 years ago
· Offline ray-tracing: AR250 (ART)
  – Accelerated offline rendering, bandwidth limited
· Volume ray-casting systems
  – Full volume ray casting on a chip
  – Many, some already commercially successful

Page 76:

Ray-Tracing Hardware: The SHARP Architecture

· SHARP architecture: Tim Purcell, Stanford
  – Mixed SW/HW approach
· Based on SmartMemories [Mai 00]
  – "Multiprocessor on a Chip"
  – Roughly 64 R10k, with 8GB/s (!) memory bandwidth

[Figure: SmartMemories layout. Labels: Chip, Tile, Quad, Processor, Interconnect, 16 x 8Kb SRAM, Quad Network]

Page 77:

Ray-Tracing Hardware: The SHARP Architecture

· Conclusions from SHARP (also see Siggraph 2001, Course 13)
  – Simple caching works very well
    • Good ray coherence
  – Off-chip bandwidth is minimal
    • Simple memory access design
  – Performance (512x512)
    • Conference scene: 50 fps
  – Reconfigurability allows adapting to demands
    • Adapt number of shading/traversal units to scene

Page 78:

Ray-Tracing Hardware: Other Architectures

· RAYA (MERL, Siggraph 2001, Course 13)
  – Based on "Memory Coherent Ray-Tracing" [Pharr]
· CORA (Saarbrücken)
  – Hardware version of Coherent RT algorithm
  – Custom-design chip
  – Est. performance: ~30/25 fps at 1024x768
    • Cruiser: 3.5 Mtris, 2 lights
    • BunnyQuake: 110 Ktris, 2 lights, 3 reflection levels

Page 79:

Tutorial Overview

· Introduction
  – Introduction to Ray-Tracing
  – Discussion: Ray-Tracing versus Rasterization
· Previous Work
  – Approximating Ray-Tracing
  – Accelerated Ray-Tracing
· Interactive Ray-Tracing on PCs
  – Coherent Ray-Tracing Implementation
  – Comparisons (SW / HW)
  – Distributed RT of Massive Models
· Outlook: Hardware-Architectures for Ray-Tracing
· Future Research and Conclusions

Page 80:

What you should take home with you…

· Interactive Ray Tracing IS feasible
  – If attention is paid to the underlying hardware…
· It's not only feasible, it's already there
  – Not only a theoretical fantasy any more…
  – And even on cheap PCs
· Not only better, it can even be faster
  – At least for certain applications

Page 81:

The Future

· IRT enables completely new applications
  – Just think what has been done with OpenGL
  – Large scale visualization: engineering, …
    • Handling of huge models
  – Interactive global illumination (?)
    • Need to adapt algorithms to new situation
  – Flexible rendering
    • Gaze tracking and non-uniform sampling density
    • Image-based or frameless rendering

Question: What can IRT do for you?

Page 82:

Open Research Problems

· Can we make it even faster?
· Hardware
  – What is the best HW architecture?
· Dynamic scenes
  – Optimized rebuild or transformation of index?
· API
  – Better alternative to OpenGL's "push model"?
  – OpenGL not suited for ray-tracing
· Global illumination
  – Efficient new algorithms

Page 83:

Acknowledgements

· AMD
  – Generous support, sponsoring and collaboration
  – Soon: 24-node dual-Athlon IV, 1.5GHz cluster
· Presenters of the Siggraph 2001 Course 13
  – Images, material, and information
· Tim Purcell & Pat Hanrahan (Stanford)
  – Many discussions and ideas
· The Max-Planck-Institute at Saarbruecken
  – Collaboration and use of their graphics hardware
· C. Benthin & M. Wagner & others
  – Work on the RT implementation and discussions

Page 84:

Links

mailto://[email protected]
For any questions or comments…

http://graphics.cs.uni-sb.de/rtrt
The Saarland University RealTime RayTracing Project

http://graphics.cs.uni-sb.de/pub/afrigraph01
"Tutorial Notes" (Slides, Papers)

http://www.openrt.de
The OpenRT Interactive Raytracing API (not yet online)

Page 85:

The Future

· Applications on compute clusters
  – Visualization of large models
  – Previewing of animations with full shading
· Hardware support for IRT
  – At least for specialized applications
· Convergence between RT and TR
  – Occlusion culling
  – Improved shading capabilities
  – Eventually based on the same API?

Page 86:

Open Research Problems: Global Illumination

· New situation
  – Ray-tracing bottleneck is gone (well, almost…)
· New challenges
  – Need for coherence
  – Efficient computations
  – Usage of view-importance
  – High degree of parallelism
  – Small communication overhead
  – Interactivity!!!
  – Can we trade quality for speed?