DIRECT3D 11 PREVIEW
UTAH CODE CAMP FALL 2008
Richard Thomson
DAZ 3D
www.daz3d.com
Direct3D 11
CTP in November 2008 DirectX SDK
Vista (and beyond) only, not on XP
Evolution of Direct3D 10
Compatible with D3D 10 cards
Evolution of Direct3D
• Direct3D 9
  ○ Stable, been around for a while
  ○ Last version to be deployed on Win XP
• Direct3D 10
  ○ First Vista-only version
  ○ Big change from D3D 9
• Direct3D 10.1
  ○ Incremental tweak to D3D 10
Direct3D 10/10.1/11 vs. 9
• Enumeration factored out to DXGI
  ○ Same DXGI used for 10, 10.1 and 11
• Divide render/texture states into chunks
  ○ Chunks of state are immutable objects
  ○ “Device state” consists of set of assigned state chunks
• Introduces new shader stages beyond vertex and pixel shaders
• Tighter API specification => no CAPS
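
The "immutable state chunk" idea above can be sketched in plain C++. This is only an illustration of the design, not the Direct3D API: the types `BlendState`, `RasterizerState`, `DeviceState` and the two functions are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical state "chunks"; the field names are illustrative only.
struct BlendState { bool blendEnable; };
struct RasterizerState { bool wireframe; bool cullBackFaces; };

// "Device state" is just the set of currently assigned immutable chunks.
struct DeviceState {
    std::shared_ptr<const BlendState> blend;
    std::shared_ptr<const RasterizerState> rasterizer;
};

// A chunk is created once; the pointer-to-const prevents later edits.
std::shared_ptr<const RasterizerState> makeRasterizerState(bool wireframe,
                                                           bool cullBackFaces) {
    return std::make_shared<RasterizerState>(
        RasterizerState{wireframe, cullBackFaces});
}

// Changing state means assigning a different chunk, never mutating one.
void setRasterizerState(DeviceState& device,
                        std::shared_ptr<const RasterizerState> chunk) {
    assert(chunk != nullptr);
    device.rasterizer = std::move(chunk);
}
```

Because a chunk never changes after creation, it can presumably be validated once and then shared freely between draw calls, which is one plausible reason for this design.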
Direct3D 11 Focus
Scalability and performance
Improving the development experience
Extending the reach of the GPU
Direct3D 11 New Features
• Tessellation
• Compute Shader
• Multithreading
• Shader Subroutines
• Improved Texture Compression
• Other Features
Tessellation
• Direct3D 10 pipeline plus three new stages for tessellation
[Pipeline diagram: Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader (with Stream Output) → Rasterizer → Pixel Shader → Output Merger]
Hull Shader
[Diagram: Hull Shader → Tessellator → Domain Shader]
• HS input: patch control pts
• HS output: patch control pts after basis conversion
• HS output:
  ○ TessFactors (how much to tessellate)
  ○ fixed tessellator mode declarations
• One Hull Shader invocation per patch
Hull Shader Syntax
[patchsize(12)]
[patchconstantfunc(MyPatchConstantFunc)]
MyOutPoint main(uint Id : SV_ControlPointID,
                InputPatch<MyInPoint, 12> InPts)
{
    MyOutPoint result;
    …
    result = TransformControlPoint( InPts[Id] );
    return result;
}
Tessellator
[Diagram: Hull Shader → Tessellator → Domain Shader]
• TS input:
  ○ TessFactors (how much to tessellate)
  ○ fixed tessellator mode declarations
• TS output:
  ○ U V {W} domain points
  ○ topology (to primitive assembly)
• Note: Tessellator does not see control points
• Tessellator operates per patch
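
To make the tessellator's role concrete, here is a minimal CPU sketch in C++ that turns a single tessellation factor into a grid of U V domain points for a quad patch. It is a deliberate simplification: one uniform integer factor stands in for the slide's TessFactors, and `tessellateQuadDomain` and `DomainPoint` are hypothetical names, not part of Direct3D. Like the real stage, it never looks at control points.

```cpp
#include <cassert>
#include <vector>

struct DomainPoint { float u, v; };

// Generate (factor + 1)^2 evenly spaced points covering the unit square.
// Control points are not an input: the tessellator only sees the factor.
std::vector<DomainPoint> tessellateQuadDomain(int factor) {
    assert(factor >= 1);
    std::vector<DomainPoint> points;
    points.reserve(static_cast<std::size_t>(factor + 1) * (factor + 1));
    for (int j = 0; j <= factor; ++j) {
        for (int i = 0; i <= factor; ++i) {
            points.push_back({ static_cast<float>(i) / factor,
                               static_cast<float>(j) / factor });
        }
    }
    return points;
}
```

Each generated point would then be handed to one domain shader invocation.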
Domain Shader
[Diagram: Hull Shader → Tessellator → Domain Shader]
• DS input:
  ○ U V {W} domain points
  ○ control points
  ○ TessFactors
• DS output:
  ○ one vertex
• One Domain Shader invocation per point from Tessellator
Domain Shader Syntax
void main( out MyDSOutput result,
           float2 myInputUV : SV_DomainPoint,
           MyDSInput DSInputs,
           OutputPatch<MyOutPoint, 12> ControlPts,
           MyTessFactors tessFactors )
{
    …
    result.Position = EvaluateSurfaceUV( ControlPts, myInputUV );
}
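
The slide leaves `EvaluateSurfaceUV` undefined. As a rough CPU reference in C++, the sketch below evaluates a bicubic Bezier patch at a domain point. It assumes a 4x4 grid of 16 control points stored row by row, which differs from the 12-point patch in the slide's declaration, and `evaluateBezierPatch` and `bernstein3` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Cubic Bernstein basis weights at parameter t in [0, 1].
std::array<float, 4> bernstein3(float t) {
    const float s = 1.0f - t;
    return { s * s * s, 3.0f * s * s * t, 3.0f * s * t * t, t * t * t };
}

// Evaluate a bicubic Bezier patch: controlPoints[j * 4 + i] is the point
// in column i (u direction) and row j (v direction).
Vec3 evaluateBezierPatch(const std::array<Vec3, 16>& controlPoints,
                         float u, float v) {
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    const std::array<float, 4> bu = bernstein3(u);
    const std::array<float, 4> bv = bernstein3(v);
    Vec3 p{0.0f, 0.0f, 0.0f};
    for (int j = 0; j < 4; ++j) {
        for (int i = 0; i < 4; ++i) {
            const float w = bu[i] * bv[j];          // tensor-product weight
            const Vec3& c = controlPoints[j * 4 + i];
            p.x += w * c.x;
            p.y += w * c.y;
            p.z += w * c.z;
        }
    }
    return p;
}
```

At the corners of the domain the patch interpolates its corner control points, so `evaluateBezierPatch(cp, 0, 0)` returns `cp[0]`. A displacement-mapped version would offset the result along the surface normal, which this sketch does not attempt.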
Single Pass Example
[Data-flow diagram]
• vertex shader: Animate/skin Control Points (patch control points in, transformed control points out)
• hull shader: Transform basis (Sub-D Patch to Bezier Patch), determine how much to tessellate (control points in Bezier patch and Tess Factors out)
• tessellator: Tessellate! (Tess Factors in, U V {W} domain points out)
• domain shader: Evaluate surface including displacement (control points in Bezier patch, U V {W} domain points and displacement map in)
Current Authoring Pipeline
(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)
• Sub-D Modeling
• Animation
• Displacement Map
• Polygon Mesh
• Generate LODs
New Authoring Pipeline
(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)
• Sub-D Modeling
• Animation
• Displacement Map
• Optimally Tessellated Mesh (GPU)
Tessellation Summary
• Helps us get closer to eliminating “pointy heads”
• Scales visual quality across PC hardware configurations
• Supports performance increases
  ○ Coarse model = compression, faster I/O to GPU
  ○ Rendering tailored to each end user’s hardware
• Better cross-platform (Windows + Xbox 360) development experience
  ○ Xbox 360 has a subset of D3D11’s tessellation
  ○ Parity = ease of cross-platform development
  ○ Extra features = innovation for Windows gaming
• Render content as the artist created it!
More on Tessellation
• GameFest 2008 Slides and Audio
  ○ “Direct3D 11 Tessellation”, Kev Gee, Microsoft
  ○ “Advanced Topics in GPU Tessellation”, Natalya Tatarchuk, AMD/ATI
  ○ “Water-Tight, Textured, Displaced Subdivision Surface Tessellation Using Direct3D 11”, Ignacio Castaño, NVIDIA
General Purpose GPU
Data Parallel Computing
• GPU performance continues to grow
• Many applications scale well to massive parallelism without tricky code changes
• Direct3D is the API for talking to GPU
• How do we expand Direct3D to GPGPU?
Compute Shader
• Direct3D 10 pipeline plus three new stages for tessellation plus Compute Shader
[Pipeline diagram: Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader (with Stream Output) → Rasterizer → Pixel Shader → Output Merger; additional labels: Compute Shader, Data Structure]
Integrated with Direct3D
• Fully supports all Direct3D resources
• Targets graphics/media data types
• Evolution of DirectX HLSL
• Graphics pipeline updated to emit general data structures…
• …which can then be manipulated by compute shader…
• And then rendered by Direct3D again
Target Applications
• Image/Post processing:
  ○ Image Reduction
  ○ Image Histogram
  ○ Image Convolution
  ○ Image FFT
• A-Buffer/OIT
• Ray-tracing, radiosity, etc.
• Physics
• AI
Computing a Histogram
Histogram()
{
    shared int Histograms[16][256];          // array of 16 histograms, shared by the thread group
    float3 vPixel = load( sampler, sv_ThreadID );
    float fLuminance = dot( vPixel, LUM_VECTOR );
    int iBin = fLuminance*255.0f;            // compute bin to increment
    int iHist = sv_ThreadIDInGroup & 15;     // use thread index to pick one of the 16 histograms
    Histograms[iHist][iBin] += 1;            // update bin
    // enable all threads in group to complete
    SynchronizeThreadGroup;
Computing a Histogram 2
    // Write register histograms out to memory:
    iBin = sv_ThreadIDInGroup.x;
    if (sv_ThreadIDInGroup.x < 256)          // one thread per bin
    {
        for (iHist = 0; iHist < 16; iHist++)
        {
            int2 destAddr = int2(iHist, iBin);
            OutputResource.add(destAddr, Histograms[iHist][iBin]); // atomic
        }
    }
}
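
The slide code is pre-release pseudocode, so a CPU reference in C++ may help when checking results. The sketch below follows the same scheme: 16 partial histograms selected by the low bits of the thread index, then a merge. It makes several assumptions: each loop iteration stands in for one thread, the luminance weights are Rec. 709 values because the slide does not define `LUM_VECTOR`, and the 16 partial histograms are summed into a single 256-bin result rather than written to a 16 x 256 output resource. `luminanceHistogram` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { float r, g, b; };

constexpr std::size_t kSubHistograms = 16;   // matches Histograms[16][256]
constexpr std::size_t kBins = 256;

using Histogram = std::array<std::uint32_t, kBins>;

// CPU emulation: each loop iteration plays the role of one GPU thread.
Histogram luminanceHistogram(const std::vector<Rgb>& pixels) {
    std::array<Histogram, kSubHistograms> partial{};   // zero-initialized

    for (std::size_t thread = 0; thread < pixels.size(); ++thread) {
        const Rgb& p = pixels[thread];
        // Assumed Rec. 709 weights; the slide does not define LUM_VECTOR.
        float lum = 0.2126f * p.r + 0.7152f * p.g + 0.0722f * p.b;
        if (lum < 0.0f) lum = 0.0f;          // clamp so the bin stays in range
        if (lum > 1.0f) lum = 1.0f;
        const std::size_t bin = static_cast<std::size_t>(lum * 255.0f);
        const std::size_t hist = thread & (kSubHistograms - 1);  // low 4 bits
        assert(bin < kBins);
        partial[hist][bin] += 1;
    }

    // Merge step: sum the 16 partial histograms bin by bin.
    Histogram total{};
    for (const Histogram& h : partial) {
        for (std::size_t bin = 0; bin < kBins; ++bin) {
            total[bin] += h[bin];
        }
    }
    return total;
}
```

Spreading updates over 16 partial histograms is what reduces contention on the GPU; on a single CPU thread it has no performance benefit and is kept only to mirror the slide.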
Compute Shader Summary
• Enables much more general algorithms
• Transparent parallel processing model
• Full cross-vendor support
  ○ Broadest possible installed base
• GameFest 2008:
  ○ “Direct3D 11 Compute Shader – More Generality for Advanced Techniques”, Chas Boyd, Microsoft
Multithreading
• Enables distribution across threads of
  ○ Application code
  ○ Runtime
  ○ Driver
• Device: free threaded resource creation
• Immediate Context: your single primary device for state & draws
• Deferred Contexts: your per-thread devices for state & draws
• Display Lists: Recorded sequence of graphics commands
• Requires a driver update
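
The relationship between the contexts described above can be sketched with standard C++ threads. This is a toy model, not the Direct3D API: `DeferredContext`, `ImmediateContext`, `DisplayList` and `renderFrame` are hypothetical names, and commands are plain strings. It assumes each worker thread records into its own deferred context and the main thread plays the recorded lists back on the single immediate context.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

using Command = std::string;               // stand-in for a real GPU command
using DisplayList = std::vector<Command>;  // recorded sequence of commands

// Hypothetical per-thread context: records commands, submits nothing.
class DeferredContext {
public:
    void setState(const std::string& name) { list_.push_back("state:" + name); }
    void draw(int vertexCount) { list_.push_back("draw:" + std::to_string(vertexCount)); }
    DisplayList finish() { DisplayList out; out.swap(list_); return out; }
private:
    DisplayList list_;
};

// Hypothetical primary context: the only place commands are submitted.
class ImmediateContext {
public:
    void execute(const DisplayList& list) {
        submitted_.insert(submitted_.end(), list.begin(), list.end());
    }
    const std::vector<Command>& submitted() const { return submitted_; }
private:
    std::vector<Command> submitted_;
};

std::vector<Command> renderFrame(int workerCount) {
    assert(workerCount > 0);
    std::vector<DisplayList> lists(workerCount);
    std::vector<std::thread> workers;
    for (int i = 0; i < workerCount; ++i) {
        // Each worker writes only to its own slot, so no locking is needed.
        workers.emplace_back([i, &lists] {
            DeferredContext context;
            context.setState("material" + std::to_string(i));
            context.draw(100 * (i + 1));
            lists[i] = context.finish();
        });
    }
    for (std::thread& worker : workers) worker.join();

    ImmediateContext immediate;            // single-threaded playback
    for (const DisplayList& list : lists) immediate.execute(list);
    return immediate.submitted();
}
```

Recording happens in parallel, while submission order stays deterministic because the main thread replays the lists in a fixed order.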
Shader Subroutines
• Details
  ○ Calls must be fast
  ○ Binding applies to all primitives in a Draw call
  ○ Binding operation must be fast
  ○ Need parameter passing mechanism
  ○ Need access to textures, samplers, etc.
• Advantages
  ○ Reduce register usage in Über-shaders (not worst case of all if statements)
  ○ Allows specialization of subroutines
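
As a loose CPU analogy in C++, a subroutine slot behaves like a function pointer that is bound once per draw call and then called for every pixel, instead of an Über-shader branching through every option. The names `LightingFn`, `DrawCall`, `unlit`, `lambert` and `shadePixel` are hypothetical and do not reflect HLSL syntax.

```cpp
#include <cassert>

struct Color { float r, g, b; };

// A "subroutine slot": any function with this signature can be bound.
using LightingFn = Color (*)(const Color& albedo, float nDotL);

Color unlit(const Color& albedo, float /*nDotL*/) { return albedo; }

Color lambert(const Color& albedo, float nDotL) {
    const float k = nDotL > 0.0f ? nDotL : 0.0f;   // clamp back-facing light
    return { albedo.r * k, albedo.g * k, albedo.b * k };
}

// The binding is chosen once per draw call and applies to all primitives.
struct DrawCall {
    LightingFn lighting;
};

// Per-pixel work calls through the bound slot; no chain of if statements.
Color shadePixel(const DrawCall& draw, const Color& albedo, float nDotL) {
    assert(draw.lighting != nullptr);
    return draw.lighting(albedo, nDotL);
}
```

Binding `lambert` for one draw call and `unlit` for the next changes behavior without recompiling or branching inside `shadePixel`, which is the specialization the slide refers to.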
Improved Texture Compression
• Why?
  ○ Existing block palette interpolations too simple
  ○ Results often rife with blocking artifacts
  ○ No high dynamic range (HDR) support
New Texture Formats
• BC6 (aka BC6H)
  ○ High dynamic range
  ○ 6:1 compression (16 bpc RGB)
  ○ Targeting high (not lossless) visual quality
• BC7
  ○ LDR with alpha
  ○ 3:1 compression for RGB or 4:1 for RGBA
  ○ High visual quality
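
The ratios quoted above follow from simple arithmetic, sketched here in C++. The calculation assumes that both formats store each 4x4 texel block in 128 bits and that the LDR source uses 8 bits per channel; the slide itself states only the ratios and the 16 bpc figure. `compressionRatio` is a hypothetical helper.

```cpp
#include <cassert>

constexpr int kTexelsPerBlock = 4 * 4;         // assumed 4x4 texel blocks
constexpr int kCompressedBitsPerBlock = 128;   // assumed 16 bytes per block

// Ratio of uncompressed block size to compressed block size.
constexpr int compressionRatio(int uncompressedBitsPerTexel) {
    assert(uncompressedBitsPerTexel > 0);
    return (kTexelsPerBlock * uncompressedBitsPerTexel) / kCompressedBitsPerBlock;
}

// BC6H: 16 bits x 3 channels = 48 bits per texel -> 768 / 128 = 6
static_assert(compressionRatio(48) == 6, "BC6H, 16 bpc RGB");
// BC7, RGB source: 8 bits x 3 channels = 24 bits per texel -> 384 / 128 = 3
static_assert(compressionRatio(24) == 3, "BC7, 8 bpc RGB");
// BC7, RGBA source: 8 bits x 4 channels = 32 bits per texel -> 512 / 128 = 4
static_assert(compressionRatio(32) == 4, "BC7, 8 bpc RGBA");
```

The compressed size per block is the same in every case, so the ratio depends only on how large the uncompressed source texels are.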
Compression of New Formats
• Block compression (unchanged)
  ○ Each block independent
  ○ Fixed compression ratio
• Multiple block types (new)
  ○ Tailored to different types of content
  ○ Smooth gradients vs. noisy normal maps
  ○ Varied alpha vs. constant alpha
• Decompression results must be bit-accurate with spec
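Comparison Results 1
[Image comparison: original vs. BC3 and original vs. BC7, with absolute error]
Comparison Results 2
[Image comparison: original vs. BC3 and original vs. BC7, with absolute error]
Comparison Results 3
[Image comparison: HDR original at given exposure vs. BC6 at given exposure, with absolute error]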
Other Features
• Addressable Stream Out
• Draw Indirect
• Pull-model attribute eval
• Improved Gather4
• Min-LOD texture clamps
• 16K texture limits
• Required 8-bit subtexel, submip filtering precision
• Conservative oDepth
• 2 GB Resources
• Geometry shader instance programming model
• Optional double support
• Read-only depth or stencil views
Thanks
• Allison Klein, Senior Lead Program Manager, Direct3D, Microsoft
• Chas. Boyd, Architect, Windows Desktop & Gaming Technology, Microsoft
Thank you to our Sponsors!