ECS 298 Photorealistic Image Synthesis Rendering Acceleration Brian Budge Center for Image...

ECS 298 Photorealistic Image Synthesis

Rendering Acceleration

Brian BudgeCenter for Image Processing and

Integrated ComputingComputer Science Department

University of California, Davis

Hardware Flow VisualizationUniversity of California, Davis

Visualization and Graphics Research Group

CIPICCIPIC

Overview

• Why do we need acceleration?

• Box/Bounding Box intersection

• Sub-linear acceleration via bounding volume hierarchies

• Sub-linear acceleration via space partitioning

• Mailboxing

• Common code optimizations



CIPICCIPIC

Acceleration: Why do we need it?

• The methods we use take a long time

• It can take several hours or even days to compute a realistic image

• Brute force is pretty stupid



CIPICCIPIC

Meet the bounding box, your new best friend

• We need a bounding volume

• Two come to mind immediately – sphere and box

• Why these? Because the intersection tests are fast. Only need a boolean test – do we hit the bounding volume?

• Why choose one over the other? – Need to try to bound objects closely

• Obviously, spheres bound spheres best and boxes bound boxes best

• Worst case: polygon. Boxes much better for polygons

• Oh, I should mention, we’re talking about axis aligned boxes



CIPICCIPIC

Bounding box intersection test• The best current bbox test is one based off of Brian Smitt’s

work, by Amy Williams

• Makes use of IEEE floating point rules to eliminate redundant conditionals

• Basically it’s three slab intersections, one in X, one in Y, and one in Z

• If the parametric value t is valid for all three slab intersections, there is an intersection

• Especially if you end up using a BVH, you should get her code (it’s online – I’ll provide the URL)



CIPICCIPIC

Bounding Volume Hierarchies (BVH)

• This is probably the easiest acceleration structure to write

• Sometimes it even works okay when it’s broken

• Starts being faster than naïve testing after just a handful of objects

• Like I said before, sphere’s are okay, but I’d recommend axis aligned bounding boxes

• Let’s look at a 2D example



CIPICCIPIC



CIPICCIPIC

Notes…• We can see problems here already

– Overlap is a potential killer

• Can get better or worse builds of the same data– Alternate x,y,z

– Each alteration choose best x,y,z

– Try some magical function to optimize the whole thing

– Also, can try different levels of termination



CIPICCIPIC

Spatial subdivision• Lots of data structures for spatial subdivision

• Uniform Grid (and Hierarchical Grid)

• Octree

• BSP Tree

• Kd Tree



CIPICCIPIC

Uniform Grid• Uniform Grids are the best data structure for:

– Uniformly spread objects

– Small self-contained objects

• Example of a good scene: Millions of little spheres randomly distributed in a cube

• Example of a bad scene: A room with 4 walls, a floor and a ceiling



CIPICCIPIC

Building a uniform grid• Easiest way is to make bounding boxes for each object, and

place the object in all grid voxels that overlap with the bounding box

• Decide the grid layout by– Approximating the number of objects in each direction

• Shirley has a reasonable heuristic in his book (also has some code)

• Want to try to have few object in each voxel, but also want few empty voxels



CIPICCIPIC

Traversing a uniform grid• Algorithm called 3D-DDA

• Can end up being painful if there are lots of objects in each voxel

• Seidel suggests using a hierarchical grid 2 levels deep to eliminate a bunch of this



CIPICCIPIC

Octrees• I could probably have many of you explain this

• Octrees are better than uniform grids for many scenes

• Not so much of a dependency on uniformity

• Traversal isn’t as fast, but there could be a lot less to traverse (making it faster)



CIPICCIPIC

Building an Octree• Start with a bounding box of all your objects

• If there are enough objects inside, split space into 8 uniformly sized voxels

• Recursively descend and decide if each of your new voxels should be split into 8

• Could be split into 8 non-uniform voxels, but this is complicated, and traversal would be much harder

• Again, you’d probably use bounding boxes for each object to decide if the object is within a voxel



CIPICCIPIC

Traversing an Octree• Much more time consuming than uniform grid

• Lots of ways to do it

• What is most efficient? Not sure. Quite honestly, I haven’t implemented this

• See ray tracing news for a HUGE overview of this topic



CIPICCIPIC

BSP Trees• BSP trees are binary space partitioning trees

• They do exactly what they say: split space into two pieces

• General BSP trees can have arbitrarily aligned polygons splitting up space

• This allows for the best possible linear split of space (could do better if we allowed splitting space by trimmed NURBS or something)



CIPICCIPIC

BSP Trees (continued)• There are problems with generalized BSP trees that cause most

people not to want to use them

• It is probably an NP-complete problem to split up space in the best possible case

• That’s okay – for other data structures, same thing

• But the problem is that even finding good heuristics for good partitioning is difficult

• (Note that BSP trees work great if all you have is triangles – good algorithms for splitting triangles for best-case partitioning exist)



CIPICCIPIC



CIPICCIPIC

Kd-trees• Special case of BSP tree – axis aligned

• Kd-trees seem to be one of the best data structures for ray tracing

• Traversal is extremely fast – testing intersection against the plane is trivial as it is solving a parametric equation for one variable

• More truly logarithmic than bounding volume hierarchy, since there is early out



CIPICCIPIC

Building Kd-trees• This part is the pain in the butt

• Getting a good partitioning can be hard

• Note that you don’t want your partition to go through many objects, since those need to be duplicated on both sides of the split

• Many heuristics out there for building one of these. I’m not sure which is best, but maybe we should have a look at Havran’s work (best efficiency scheme project)



CIPICCIPIC

Which one is best?• They all have their benefits

• I believe that in the general case where things will tend to be clustered, the trees win out

• Seems that if you can get a good partitioning, kd-trees are the best because of their easy fast traversal (and did I mention low memory overhead?)

• Different scenes will be good with different schemes



CIPICCIPIC

Which one is best? (continued)• We could implement them all, and try them on all scenes ;-)

Yeah right

• Implement a few and know when they work well

• The cool part is that you can treat an acceleration scheme the same as an object

• This makes it easy to nest acceleration schemes, which is very very good



CIPICCIPIC

Mailboxing• So when you’re traversing spatial partitioning data structures,

you might possibly intersect the same object over and over (if it lies in more than one voxel)

• Mailboxing keeps track of this. When you encounter an object, check to see if you’ve already hit it at some point!

• Doing this cleverly can reduce running time pretty dramatically in real scenes where overlap is common



CIPICCIPIC

Mailboxing (continued)• I’m not a huge fan of mailboxing though

• Reason being that it is hard to parallelize

• Well, not hard really – too big. You’re keeping around to much data for each ray. Just too heavyweight

• Possible research topic: Some kind of caching scheme. It would be a lot of toying around with cache sizes and replacement schemes



CIPICCIPIC

Other optimizations• One that I use is to keep track of the interval where the ray is

valid

• If you hit an object at t=20, there’s no point searching past that point!

• Likewise, you can avoid self intersection by having an interval (epsilon,infinity)

• Can use a special memory allocator (preferably one that is already written and tested )



CIPICCIPIC

Code optimization• Remember Knuth: Premature optimization is evil!

• However, in bottleneck situations, it can be extremely important

• Remember that 90% of the time is spent in 10% of the code

• Optimize only what you need to for the greatest benefit!



CIPICCIPIC

Code optimization (continued)• Remember that conditionals are not the same as arithmetic

operations

• They are expensive if branch prediction is incorrect!

• So try to avoid them (can’t really do this that much)

• Also place the one that is most likely in the if condition, since that is what Intel chooses without other information



CIPICCIPIC

Code optimization (continued)• Early exit is usually a good idea

• Divides, sqrt, and trig functions are expensive

• If you’re dividing by the same value multiple times, do it once and multiply instead (could be hundreds of times faster)

• I use that one everywhere

• If possible delay expensive computations like sqrt until after a conditional – Example is when doing the quadratic equation



CIPICCIPIC

References• Ray tracing news, specifically

– http://www.acm.org/tog/resources/RTNews/html/rtnv12n2.html#art4

• Hans-Peter Seidel’s notes– http://www.mpi-sb.mpg.de/units/ag4/teaching/uebung/lecture20.pdf

• Pete Shirley’s book– Pete Shirley and Russel Morley, Realistic Ray Tracing, 2003, AK

Peters

• Amy Williams’ paper and code– http://www.cs.utah.edu/~awilliam/box/

• Havran’s Best Efficiency Project– http://sgi.felk.cvut.cz/BES/

• Matt Pharr and Greg Humphries’ book – good info on kd-trees and mailboxing– I have this in printed form that can be borrowed



CIPICCIPIC

Additional Info• I need you guys to sign up for presentations! They start

Tuesday

• It’ll be tricky to get everyone signed up for separate lectures

• Fill out the sheet after class

• Class web page is at http://graphics.cs.ucdavis.edu/~bcbudge/ecs298_2004

• Should be more filled in by tomorrow



CIPICCIPIC

Scene file format• After class, I’d like to get together and decide on a scene file format

• We will need something user editable, and yet not ridiculous for real scenes

• Mine currently is something like this:

• Camera{– Lens{

• Pinhole (or thin lens, or fisheye)• Aperture 8mm• Etc…

– }

• }• Color red {0.8,0.5,0.6,0.4,0.3,0.2,0.1,0.1,0.1,0.1}• Object{

– Type Trimesh– Location “my_mesh.tmf”– Material my_mat

• }

ECS 298 Photorealistic Image Synthesis Rendering Acceleration Brian Budge Center for Image...

Documents

Transcript of ECS 298 Photorealistic Image Synthesis Rendering Acceleration Brian Budge Center for Image...