Shader generation and compilation for a programmable GPU


Page 1: Shader generation and compilation for a programmable GPU

Shader generation and compilation for a programmable GPU

Student: Jordi Roca Monfort

Advisor: Agustín Fernández Jiménez

Co-advisor: Carlos González Rodríguez

Page 2: Shader generation and compilation for a programmable GPU

Outline

Introduction. Background. Goals. Design and implementation. Conclusions.

Page 3: Shader generation and compilation for a programmable GPU

Introduction

Page 4: Shader generation and compilation for a programmable GPU

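ATTILA simulation framework

[Diagram: the OpenGL application runs on the vendor OpenGL API and vendor driver. GLInterceptor captures its calls into an OpenGL trace. GLPlayer replays the trace through the ATTILA OpenGL API and the ATTILA driver into the ATTILA simulator, which produces statistics.]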

Page 5: Shader generation and compilation for a programmable GPU

[Diagram: the ATTILA simulation framework from the previous slide (OpenGL application, GLInterceptor, vendor OpenGL API, vendor driver, OpenGL trace, GLPlayer, ATTILA OpenGL API, ATTILA driver, ATTILA simulator, statistics).]

ATTILA simulator: simulates the latest generation of 3D graphics boards (programmable GPUs).

My work (ATTILA OpenGL API): extend and complete the OpenGL API to execute recent, advanced 3D applications (Doom3, Unreal Tournament, etc.).

Page 6: Shader generation and compilation for a programmable GPU

Background

Page 7: Shader generation and compilation for a programmable GPU

Rendering (I)

What is rendering?

Generating the pixels for a set of images (frames) that form an animated scene.

Goal: compute each pixel color as fast as possible, since this determines the frame rate (FPS).

Which computations are required?

Given the scene objects DB, compute the color of the projected objects in the screen pixel area.

Each pixel color depends on the scene lighting and the viewer (camera) position.

Page 8: Shader generation and compilation for a programmable GPU

Rendering (II)

[Diagram: rendering data. View info (position), geometry info (position, color) and lighting info are combined to fill the screen area.]

Page 9: Shader generation and compilation for a programmable GPU

Rendering approaches

• For each pixel (x,y), compute the physical interaction between the lights and the objects in the scene: ray tracing, radiosity, photon mapping.
  • Very expensive per-pixel computation.
  • Global lighting (shadows, indirect reflections among objects).
• Compute the interaction between objects and lights only at the vertices and, for each pixel (x,y), approximate the corresponding value: direct rendering (3D graphics boards, 3D game consoles, etc.).
  • Only direct illumination from the light sources (each vertex color is independent).

Page 10: Shader generation and compilation for a programmable GPU

Direct Rendering (I)

[Diagram: rendering data. Viewer info (position), geometry info (position, color) and lighting info produce per-vertex colors; color interpolation fills the pixels of the screen area.]

Page 11: Shader generation and compilation for a programmable GPU

Direct Rendering (II)

The higher the density of vertices, the more realistic the lighting. In addition, more vertices are required to improve the level of detail of surfaces. Thus:

▲realism → ▲vertices → ▲computation → ▼FPS

Solution: specify the surface using fewer vertices, and specify surface details using textures.

Page 12: Shader generation and compilation for a programmable GPU

Textures

[Diagram: rendering data. Viewer info (position), geometry info (position, color), lighting info and textures are combined to fill the screen area.]

Page 13: Shader generation and compilation for a programmable GPU

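Texture mapping

[Diagram: a triangle in the screen area whose vertices carry the texture coordinates (0.63,0.86), (0.26,0.37) and (0.79,0.10), in a texture space whose two axes range from 0 to 1.]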

Page 14: Shader generation and compilation for a programmable GPU

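Texture mapping

[Diagram: the same triangle, with texture coordinates (0.63,0.86), (0.26,0.37) and (0.79,0.10) at its vertices. For a pixel inside the triangle, the coordinate interpolator produces the coordinate (0.40,0.45), and the texture value sampled at that coordinate is used for the pixel.]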
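The coordinate interpolator can be illustrated with a minimal Python sketch. It assumes barycentric weights, a common way to interpolate attributes inside a triangle; the weights, the 2x2 texture and the nearest-texel lookup are hypothetical illustration values, not taken from the slides.

```python
def interpolate(attributes, weights):
    """Blend per-vertex attributes using barycentric weights that sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9, "barycentric weights must sum to 1"
    size = len(attributes[0])
    return tuple(
        sum(w * attr[i] for w, attr in zip(weights, attributes))
        for i in range(size)
    )


def sample_nearest(texture, u, v):
    """Return the texel closest to (u, v), with u and v in [0, 1]."""
    height, width = len(texture), len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


# Texture coordinates at the three triangle vertices (from the slide).
tex_coords = [(0.63, 0.86), (0.26, 0.37), (0.79, 0.10)]

# Hypothetical weights for one pixel; a rasterizer would derive them
# from the pixel position inside the triangle.
weights = (0.2, 0.5, 0.3)
u, v = interpolate(tex_coords, weights)

# Hypothetical 2x2 texture, indexed as texture[row][column].
texture = [["red", "green"], ["blue", "white"]]
texel = sample_nearest(texture, u, v)
```

The same `interpolate` function works for colors, which is the color interpolation step of direct rendering.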

Page 15: Shader generation and compilation for a programmable GPU

3D Rendering Pipeline

Inputs: 3D scene vertex DB, viewer info, lighting info, textures.

• Vertex processing stage (VERTEX SHADING), a parallelizable process. Computes: color, coordinates, vertex position in screen.
• RASTERIZER. Generates interpolated attributes (color, coordinates).
• Fragment processing stage (FRAGMENT SHADING), a parallelizable process. Per-pixel texture mapping.

Output: final screen.

Page 16: Shader generation and compilation for a programmable GPU

3D RP Implementation

Implementations:

• Software: Mesa 3D Graphics Library (OpenGL).
• Software + hardware acceleration: vendor OpenGL, Direct3D, Xbox, PlayStation, etc. Work is distributed between the CPU and the graphics board transparently to the applications.

Page 17: Shader generation and compilation for a programmable GPU

3D accelerators evolution

• 2D accelerators (pre-Voodoo): before 1996
• 3D accelerators (3Dfx Voodoo): 1996
• Graphics Processing Units (GeForce): 1999
• Programmable GPUs (GeForce 3): 2001

[Diagram: for each generation (VGA, 3D accelerators, GPU, PGPU), the pipeline DB → VS → Rasterizer → FS → final screen, showing how the work is split between the CPU and the graphics hardware.]

Page 18: Shader generation and compilation for a programmable GPU

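GPUs: applying 2 textures

[Diagram: the rasterizer emits a fragment stream. Each fragment carries its (x,y) position, an interpolated color and two texture coordinates. Fragment Unit 0, a Fixed Function unit, fetches from texture memory with each coordinate and combines the values with + and * operations to produce the final color of fragment F1.]

Uses:

• Per-pixel lighting.
• Shadow implementation.
• Bump-mapping.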

Page 19: Shader generation and compilation for a programmable GPU

Programmable GPUs: 2 textures

[Diagram: the same fragment stream (position (x,y), interpolated color, two texture coordinates) enters Fragment Shader 0, one of the Shader Processors, which has an ALU, temporary registers and access to texture memory. It produces the final color of fragment F1 by running the program below.]

LDTEX t1, coord1, Text1
LDTEX t2, coord2, Text2
ADD t1, colorIn, t1
MUL t1, t1, t2
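To make the data flow of this four-instruction program concrete, here is a small Python sketch that mirrors it step by step. It assumes that the two texture fetches have already returned RGBA texels, so the texel values are passed in directly; the function and helper names are hypothetical.

```python
def add(a, b):
    """Component-wise vector addition (ADD)."""
    return tuple(x + y for x, y in zip(a, b))


def mul(a, b):
    """Component-wise vector multiplication (MUL)."""
    return tuple(x * y for x, y in zip(a, b))


def two_texture_shader(color_in, texel1, texel2):
    """Mirror of the slide's program: final = (colorIn + t1) * t2."""
    t1 = texel1             # LDTEX t1, coord1, Text1
    t2 = texel2             # LDTEX t2, coord2, Text2
    t1 = add(color_in, t1)  # ADD t1, colorIn, t1
    t1 = mul(t1, t2)        # MUL t1, t1, t2
    return t1


# Hypothetical fragment: interpolated color plus two fetched texels.
final_color = two_texture_shader(
    color_in=(0.25, 0.25, 0.25, 1.0),
    texel1=(0.5, 0.25, 0.0, 0.0),
    texel2=(0.5, 0.5, 0.5, 1.0),
)
```

On a fixed-function unit the `+` and `*` are wired in; on a shader processor a different instruction sequence yields a different effect without changing the hardware.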

Page 20: Shader generation and compilation for a programmable GPU

Shader Processors

Shader processors (SP) execute small programs (shaders), built from vector and scalar instructions, that define the computation in the following stages:

• Vertex processing (Vertex Shader): lighting computation, on-screen vertex projection, texture coordinate generation.
• Fragment processing (Fragment Shader): texture color fetch and blending, FOG.

The result is like a GPU supporting “infinite visualization effects” that previous generations of graphics boards could not support.

Page 21: Shader generation and compilation for a programmable GPU

Goals

Page 22: Shader generation and compilation for a programmable GPU

Goals

Implement all the necessary modules in the OpenGL API to:

• Support new real 3D applications that use shaders in our simulation framework.
• Also support old applications that use the Fixed Function (FF), and applications that combine shaders and FF.

Idea: perform Fixed Function emulation by generating equivalent shaders for the SP.

Page 23: Shader generation and compilation for a programmable GPU

Things to do

• Implement shader support in our OpenGL API, using the shader programming languages most used by 3D applications: ARB_vertex_program and ARB_fragment_program.
• Study how to express FF functions in terms of shaders (pre-study phase).

Page 24: Shader generation and compilation for a programmable GPU

Design and implementation

Page 25: Shader generation and compilation for a programmable GPU

Fixed Function emulation

Page 26: Shader generation and compilation for a programmable GPU

FF Emulation

[Diagram: DB → Vertex Shader → Rasterizer → Fragment Shader → final screen. The two programs below run in the Vertex Shader and in the Fragment Shader.]

!!ARBvp1.0

ATTRIB pos = vertex.position;
PARAM mat[4] = { state.matrix.mvp };

# Transform by concatenation of the
# MODELVIEW and PROJECTION matrices.
DP4 result.position.x, mat[0], pos;
DP4 result.position.y, mat[1], pos;
DP4 result.position.z, mat[2], pos;
DP4 result.position.w, mat[3], pos;

# Pass the primary color through
# w/o lighting.
MOV result.color, vertex.color;

END

!!ARBfp1.0

# first set of texture coordinates
ATTRIB tex = fragment.texcoord;

# interpolated color
ATTRIB col = fragment.color;

OUTPUT outColor = result.color;
TEMP tmp;

# sample the texture
TEX tmp, tex, texture, 2D;
# perform the modulation
MUL outColor, tmp, col;

END
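The four DP4 instructions of the vertex program amount to one matrix-vector product: each instruction is the dot product of one matrix row with the vertex position. A minimal Python sketch of that behavior follows; the function names and the translation matrix used as the modelview-projection matrix are hypothetical illustration values.

```python
def dp4(a, b):
    """Four-component dot product, as computed by the DP4 instruction."""
    return sum(x * y for x, y in zip(a, b))


def vertex_program(mvp_rows, position, color):
    """Mirror of the vertex program: transform the position, pass the color."""
    # DP4 result.position.{x,y,z,w}, mat[i], pos;
    result_position = tuple(dp4(row, position) for row in mvp_rows)
    # MOV result.color, vertex.color;
    result_color = color
    return result_position, result_color


# Hypothetical matrix: a translation by (2, 3, 4), given as four rows.
mvp = [
    (1.0, 0.0, 0.0, 2.0),
    (0.0, 1.0, 0.0, 3.0),
    (0.0, 0.0, 1.0, 4.0),
    (0.0, 0.0, 0.0, 1.0),
]
out_position, out_color = vertex_program(
    mvp, position=(1.0, 1.0, 1.0, 1.0), color=(1.0, 0.5, 0.0, 1.0)
)
```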

Page 27: Shader generation and compilation for a programmable GPU

FF emulation

Implemented functions (according to the OpenGL 2.0 specification):

• Vertex Shading (85% of total):
  • Per-vertex standard OpenGL lighting: point, directional and spot lights; attenuation; local and infinite viewer.
  • Vertex transformation.
  • Automatic texture coordinate generation: Object Plane, Eye Plane, Normal Map, Reflection Map and Sphere Map.
  • FOG coordinate.
• Fragment Shading (90% of total):
  • Multi-texturing and texture combine functions.
  • FOG application: Linear, Exponential and Second Order Exponential.

Page 28: Shader generation and compilation for a programmable GPU

FF emulation example

FOG application:

Algorithm: for each pixel, perform a linear interpolation between the original color and the fog color, according to the distance from the object to the viewer.

Page 29: Shader generation and compilation for a programmable GPU

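FOG emulation

FOG exponential mode:

$f = e^{-\mathit{density} \cdot \mathit{fogcoord}}$

Since $e = 2^{1/\ln 2}$, the same factor can be written with a base-2 exponential:

$f = 2^{-(\mathit{density} \cdot \mathit{fogcoord})/\ln 2}$

Final color = pixel color * f + fog color * (1 - f)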
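The base change matters because the shader instruction set offers a base-2 exponential. The following Python sketch checks numerically that both forms give the same fog factor and applies the blend; the function names and sample values are hypothetical.

```python
import math


def fog_factor_natural(density, fog_coord):
    """Exponential fog factor: f = e^(-density * fogcoord)."""
    return math.exp(-density * fog_coord)


def fog_factor_base2(density, fog_coord):
    """Same factor via base 2: f = 2^(-(density * fogcoord) / ln 2)."""
    scaled_density = density / math.log(2.0)  # can be precomputed once
    return 2.0 ** (-(scaled_density * fog_coord))


def apply_fog(pixel_color, fog_color, f):
    """Final color = pixel color * f + fog color * (1 - f), f clamped to [0, 1]."""
    f = min(max(f, 0.0), 1.0)
    return tuple(p * f + g * (1.0 - f) for p, g in zip(pixel_color, fog_color))


# Hypothetical values: density 0.35, fragment at fog coordinate 4.0.
f_natural = fog_factor_natural(0.35, 4.0)
f_base2 = fog_factor_base2(0.35, 4.0)
fogged = apply_fog((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.5)
```

Because `density / ln 2` does not depend on the fragment, it can be computed once on the CPU and passed to the shader as a parameter.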

Page 30: Shader generation and compilation for a programmable GPU

FOG emulation

!!ARBfp1.0
ATTRIB fogCoord = fragment.fogcoord;
OUTPUT oColor = result.color;
PARAM fogColor = state.fog.color;
PARAM fogParams = program.local[0]; # fogParams.x : density/ln(2)

TEMP fragmentColor, fogFactor;

# Texture applications....

# Fog Factor computing...
MUL fogFactor.x, fogParams.x, fogCoord.x; # fogFactor.x = density*fogcoord/ln(2)
EX2_SAT fogFactor.x, -fogFactor.x; # fogFactor.x = 2^-(fogFactor.x)

# Fog color interpolation
LRP oColor, fogFactor.x, fragmentColor, fogColor;

END
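The program above covers the exponential mode. As a rough sketch of how Fixed Function emulation could pick the instructions from the current fog state, the Python generator below emits a fragment program for each of the three fog modes. Everything here is an assumption for illustration: the function names, the packing of `fogParams`, the `MOV` that stands in for the texture stage, and the linear and second-order sequences, which are derived from the standard OpenGL fog equations rather than from the slides.

```python
import math


def fog_factor_instructions(mode):
    """Return ARB fragment program lines that leave f in fogFactor.x."""
    if mode == "linear":
        # f = (end - c) / (end - start) = c * fogParams.x + fogParams.y
        return ["MAD_SAT fogFactor.x, fogCoord.x, fogParams.x, fogParams.y;"]
    if mode == "exp":
        # f = 2^-(c * density / ln 2)
        return ["MUL fogFactor.x, fogParams.x, fogCoord.x;",
                "EX2_SAT fogFactor.x, -fogFactor.x;"]
    if mode == "exp2":
        # f = 2^-((c * density / sqrt(ln 2))^2)
        return ["MUL fogFactor.x, fogParams.x, fogCoord.x;",
                "MUL fogFactor.x, fogFactor.x, fogFactor.x;",
                "EX2_SAT fogFactor.x, -fogFactor.x;"]
    raise ValueError(f"unknown fog mode: {mode!r}")


def fog_params(mode, density=1.0, start=0.0, end=1.0):
    """Constants to load into program.local[0] (hypothetical packing)."""
    if mode == "linear":
        return (-1.0 / (end - start), end / (end - start))
    if mode == "exp":
        return (density / math.log(2.0), 0.0)
    if mode == "exp2":
        return (density / math.sqrt(math.log(2.0)), 0.0)
    raise ValueError(f"unknown fog mode: {mode!r}")


def generate_fog_program(mode):
    """Assemble a complete fragment program for the given fog mode."""
    lines = [
        "!!ARBfp1.0",
        "ATTRIB fogCoord = fragment.fogcoord;",
        "OUTPUT oColor = result.color;",
        "PARAM fogColor = state.fog.color;",
        "PARAM fogParams = program.local[0];",
        "TEMP fragmentColor, fogFactor;",
        "MOV fragmentColor, fragment.color;  # stand-in for the texture stage",
    ]
    lines += fog_factor_instructions(mode)
    lines += ["LRP oColor, fogFactor.x, fragmentColor, fogColor;", "END"]
    return "\n".join(lines)


exp_program = generate_fog_program("exp")
```

Calling `generate_fog_program("linear")` or `generate_fog_program("exp2")` changes only the fog factor lines, which is the point of generating the shader from state instead of writing one shader per combination.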

Page 31: Shader generation and compilation for a programmable GPU

ARB compilers

Page 32: Shader generation and compilation for a programmable GPU

ARB compilers

[Slide shows the same ARB vertex program and fragment program as the FF Emulation slide (Page 26), here as example inputs for the compilers.]

Page 33: Shader generation and compilation for a programmable GPU

The compilers common architecture

Input example:

!!ARBvp1.0
PARAM arr[5] = { program.env[0..4] };
#ADDRESS addr;
ATTRIB v1 = vertex.attrib[1];
PARAM par1 = program.local[0];
OUTPUT oPos = result.position;
OUTPUT oCol = result.color.front.primary;
OUTPUT oTex = result.texcoord[2];
ARL addr.x, v1.x;
MOV res, arr[addr.x - 1];
END

Compilation pipeline:

• Lexical and syntactic analysis (Flex + Bison): turns the program text into the IR.
• Semantic analysis: works on the IR, using a symbol table.
• Code generation: generic code first, then GPU-specific code.

Output (hex dump, 16 bytes per line):

Line: By0 By1 By2 By3 By4 By5 By6 By7 By8 By9 ByA ByB ByC ByD ByE ByF
011: 16 00 03 28 00 01 00 08 26 1b 6a 00 0f 1b 04 78
012: 09 00 03 00 00 00 02 08 24 1b 1b 00 08 1b 14 18
013: 09 00 04 00 00 00 02 08 24 1b 1b 00 04 1b 14 b8
014: 09 00 05 00 00 00 02 08 24 1b 1b 00 02 1b 04 58
015: 09 00 06 00 00 00 02 08 24 1b 1b 00 01 1b 04 f8
016: 16 00 01 00 00 00 02 30 24 1b 1b 00 08 1b 14 98
017: 16 00 02 00 00 01 02 30 24 1b 1b 00 08 1b 04 38
018: 16 00 00 00 00 00 03 30 24 00 1b 00 02 1b 04 d8
019: 16 00 01 00 00 00 03 30 24 00 1b 00 01 1b 14 78
020: 01 00 08 00 00 08 18 08 24 04 ae 00 0c 1b 04 18
021: 17 00 00 00 00 00 13 30 24 00 00 00 08 1b 04 b8
022: 17 00 01 00 00 00 13 30 24 00 00 00 04 1b 14 58
023: 01 00 08 00 00 09 18 08 24 04 04 00 0c 1b 14 f8
024: 01 00 08 00 00 0a 18 08 26 04 ae 00 0c 1b 04 98
025: 01 00 08 00 00 0b 18 08 26 04 04 00 0c 1b 14 38
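The presented compilers use Flex and Bison for the first stage. As a rough illustration of what the lexical step produces, here is a regex-based tokenizer in Python. It is a stand-in technique, not the Flex specification of the project: the token classes are hypothetical and cover only a simplified subset of the ARB program syntax.

```python
import re

# Hypothetical token classes; order matters (first match wins).
TOKEN_SPEC = [
    ("HEADER", r"!!ARB(?:vp|fp)1\.0"),
    ("COMMENT", r"#[^\n]*"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("DOTDOT", r"\.\."),
    ("PUNCT", r"[;,.=\[\]{}+\-]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Split program text into (kind, text) pairs, dropping blanks and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        kind = match.lastgroup
        if kind not in ("SKIP", "COMMENT"):
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("!!ARBvp1.0\nMOV result.color, vertex.color; # copy\nEND")
```

A parser then consumes this token stream and builds the IR nodes; in the real tool chain that role is played by the Bison grammar.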

Page 34: Shader generation and compilation for a programmable GPU

Intermediate Representation

Example:

!!ARBvp1.0

ATTRIB pos = vertex.position;
PARAM mat[4] = { state.matrix.mvp };

# Transform by concatenation of the
# MODELVIEW and PROJECTION matrices.
DP4 result.position.x, mat[0], pos;
DP4 result.position.y, mat[1], pos;
DP4 result.position.z, mat[2], pos;
DP4 result.position.w, mat[3], pos;

# Pass the primary color through
# w/o lighting.
MOV result.color, vertex.color;

END

IR tree for the ATTRIB statement and the first DP4 instruction:

IRProgram
  header: “!!ARBvp1.0”
  Program Statements
    IRVP1ATTRIBStatement
      name: pos
      attrib: vertex.position
    IRInstruction
      opcode: DP4
      destination
        IRDstOperand
          destination: result.position
          writeMask: x
          isResultRegister: true
      sources
        IRSrcOperand
          source: mat
          swizzleMask: xyzw
          isInputRegister: false
        IRSrcOperand
          source: pos
          swizzleMask: xyzw
          isInputRegister: false
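The IR node types named on this slide could be modeled as plain data classes. The sketch below is one hypothetical Python rendering: the class names follow the slide, while the field names, the defaults and the flat statement list are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IRSrcOperand:
    source: str
    swizzle_mask: str = "xyzw"
    is_input_register: bool = False


@dataclass
class IRDstOperand:
    destination: str
    write_mask: str = "xyzw"
    is_result_register: bool = False


@dataclass
class IRInstruction:
    opcode: str
    destination: IRDstOperand
    sources: List[IRSrcOperand] = field(default_factory=list)


@dataclass
class IRVP1ATTRIBStatement:
    name: str
    attrib: str


@dataclass
class IRProgram:
    header: str
    statements: list = field(default_factory=list)


# The ATTRIB statement and the first DP4 instruction from the slide.
program = IRProgram(
    header="!!ARBvp1.0",
    statements=[
        IRVP1ATTRIBStatement(name="pos", attrib="vertex.position"),
        IRInstruction(
            opcode="DP4",
            destination=IRDstOperand(
                "result.position", write_mask="x", is_result_register=True
            ),
            sources=[IRSrcOperand("mat"), IRSrcOperand("pos")],
        ),
    ],
)
```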

Page 35: Shader generation and compilation for a programmable GPU

Semantic analysis and generic code generation

Features:

• Implemented using the visitor pattern, which decouples the IR from the operations involved in each compiler phase.
• Allows a common analyzer and a common code generator to be used for both program types.
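A minimal Python sketch of the visitor pattern applied to a tiny IR is shown below. The node classes know nothing about the compiler phases; each phase is a separate visitor. All class names, the error message format and the list of predefined binding prefixes are hypothetical simplifications.

```python
class Declaration:
    def __init__(self, name, binding):
        self.name, self.binding = name, binding

    def accept(self, visitor):
        return visitor.visit_declaration(self)


class Instruction:
    def __init__(self, opcode, dst, srcs):
        self.opcode, self.dst, self.srcs = opcode, dst, list(srcs)

    def accept(self, visitor):
        return visitor.visit_instruction(self)


class Program:
    def __init__(self, header, statements):
        self.header, self.statements = header, list(statements)

    def accept(self, visitor):
        return visitor.visit_program(self)


class SemanticChecker:
    """Phase 1: fill a symbol table and report undeclared identifiers."""
    BUILTIN_PREFIXES = {"vertex", "fragment", "result", "state", "program"}

    def __init__(self):
        self.symbols, self.errors = {}, []

    def visit_program(self, node):
        for statement in node.statements:
            statement.accept(self)

    def visit_declaration(self, node):
        self.symbols[node.name] = node.binding

    def visit_instruction(self, node):
        for name in [node.dst] + node.srcs:
            known = name in self.symbols
            builtin = name.split(".")[0] in self.BUILTIN_PREFIXES
            if not (known or builtin):
                self.errors.append(f"undeclared identifier '{name}'")


class GenericCodeGenerator:
    """Phase 2: emit generic (opcode, dst, srcs) tuples."""

    def __init__(self):
        self.code = []

    def visit_program(self, node):
        for statement in node.statements:
            statement.accept(self)

    def visit_declaration(self, node):
        pass  # declarations produce no instructions

    def visit_instruction(self, node):
        self.code.append((node.opcode, node.dst, tuple(node.srcs)))


program = Program("!!ARBvp1.0", [
    Declaration("pos", "vertex.position"),
    Instruction("MOV", "result.color", ["vertex.color"]),
    Instruction("MOV", "res", ["pos"]),  # 'res' is never declared
])
checker, generator = SemanticChecker(), GenericCodeGenerator()
program.accept(checker)
program.accept(generator)
```

Since neither visitor inspects the header, the same two classes would serve a program starting with `!!ARBfp1.0`, which is how a common analyzer and generator can cover both program types.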

Page 36: Shader generation and compilation for a programmable GPU

Code generation

Phase 1: generate architecture-independent generic code, assuming unbounded machine resources.

Phase 2: translate the generic code into specific code, taking the constraints of the concrete GPU architecture into account.

[Diagram: GenericCode (a list of GenericInstruction) is translated, with the help of a Machine File Descriptor, into Specific Code (a list of GPUInstruction).]
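The two phases can be sketched in Python as follows. Generic code uses as many virtual temporaries (`t0`, `t1`, ...) as it likes; the lowering step maps them onto the limited registers listed in a machine description and rejects unsupported opcodes. The dictionary used as machine description, the register naming and the simple last-use allocation are all assumptions for illustration, not the ATTILA implementation.

```python
import re

TEMP = re.compile(r"^t\d+$")  # virtual temporaries of the generic code


def lower_to_gpu(generic_code, machine):
    """Map (opcode, dst, srcs) generic instructions onto machine registers."""
    # Record the last instruction index at which each temporary appears.
    last_use = {}
    for index, (_, dst, srcs) in enumerate(generic_code):
        for name in (dst, *srcs):
            if TEMP.match(name):
                last_use[name] = index

    free = [f"r{i}" for i in range(machine["num_temps"])]
    assigned = {}
    gpu_code = []
    for index, (opcode, dst, srcs) in enumerate(generic_code):
        if opcode not in machine["opcodes"]:
            raise ValueError(f"opcode {opcode} not supported by target")
        for name in srcs:
            if TEMP.match(name) and name not in assigned:
                raise ValueError(f"{name} is read before being written")
        new_srcs = tuple(assigned.get(name, name) for name in srcs)
        # Release registers whose temporary is read here for the last time.
        for name in dict.fromkeys(srcs):
            if TEMP.match(name) and name != dst and last_use[name] == index:
                free.append(assigned.pop(name))
        new_dst = dst
        if TEMP.match(dst):
            if dst not in assigned:
                if not free:
                    raise ValueError("out of temporary registers")
                assigned[dst] = free.pop(0)
            new_dst = assigned[dst]
        gpu_code.append((opcode, new_dst, new_srcs))
        # A temporary that is never read later can be released right away.
        if TEMP.match(dst) and last_use[dst] == index:
            free.append(assigned.pop(dst))
    return gpu_code


generic = [
    ("MOV", "t0", ("in0",)),
    ("MOV", "t1", ("in1",)),
    ("ADD", "t2", ("t0", "t1")),
    ("MUL", "t3", ("t2", "t2")),
    ("MOV", "out", ("t3",)),
]
machine = {"num_temps": 2, "opcodes": {"MOV", "ADD", "MUL"}}
specific = lower_to_gpu(generic, machine)
```

Four virtual temporaries fit into two registers here because each register is reused once its value is no longer needed; with `num_temps` set to 1 the same program is rejected.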

Page 37: Shader generation and compilation for a programmable GPU

Conclusions

Page 38: Shader generation and compilation for a programmable GPU

Conclusions

Achieved goals: the OpenGL API implementation now supports:

• Fixed Function emulation of almost the entire set of functions of the VS and FS stages (the most important ones).
• Shader compilation for the ARB_vertex_program and ARB_fragment_program specifications. Both compilers share most of the implementation, with a clear separation between generic and specific stages.

Page 39: Shader generation and compilation for a programmable GPU

Future work

Support other parts of the 3D rendering pipeline (e.g. interpolation) as programmable stages, to reduce hardware complexity and power consumption (embedded systems).

Implement compilers for high-level shading languages (GLSlang, HLSL).

Page 40: Shader generation and compilation for a programmable GPU

End of the presentation