Static Stages for Heterogeneous Programming - …asampson/media/braid... · Apple iPhone 6s...

Post on 07-Mar-2018

Static Stages for Heterogeneous Programming

Adrian Sampson, Cornell
Kathryn S McKinley, Google
Todd Mytkowicz, Microsoft Research

Apple iPhone 6s Smartphone

Application Processors – Sneak Peak, as promised!

APL1022 TSMC 16 nm FinFET APL0898 Samsung 14 nm FinFET

NOTE: False color and image sharpening has been applied to the photos for the purposes of this article. High resolution images in Chipworks reports are not retouched.

Apple A9 (techinsights.com)

CPUs, GPUs, DSP, ISP, audio codecs, video codecs, modems

Mobile SoCs

Microsoft Catapult

Google TPU

Datacenter Servers

CPU, accelerator A, accelerator B, accelerator C

C++ program (CPU), plus a separate program for each accelerator

unified program: CPU code, accelerator A code, accelerator B code, accelerator C code

Heterogeneous programming languages need support for placement and specialization.

With extensions, multi-stage programming can support both concepts.

Current APIs for real-time graphics are especially unsafe, verbose, and brittle. We can help.

!<[]>

CPU → Commands → GPU → Pixels → Display

Rendering Pipeline: programmable & fixed-function stages

CPU (C, C++, JavaScript) → vertex positions → Vertex Shader (GLSL) → Fragment Shader (GLSL) → pixel colors → Display

Fragment Shader

in vec4 fragPos;
void main() {
  gl_FragColor = abs(fragPos);
}

Vertex Shader

in vec4 position;
in float dist;
out vec4 fragPos;
void main() {
  fragPos = position;
  gl_Position = position + dist;
}

CPU “Host Code”

setup:

static const char *vertex_shader = "in vec4 position; ...";
static const char *fragment_shader = "in vec4 fragPos; ...";
GLuint program = compileAndLink(vertex_shader, fragment_shader);
// ... more boilerplate ...
// glGetUniformLocation returns a GLint (-1 if the name is not found)
GLint loc_dist = glGetUniformLocation(program, "dist");

render a frame:

glUseProgram(program);
glUniform1f(loc_dist, 4.0);
// ... assign other "in" parameters ...
glDrawArrays(...);

"dits": a misspelled name in the lookup string still compiles; glGetUniformLocation only returns -1 at run time.

GPU shader specialization

Übershader: #ifdef, #define, #if, #ifndef, #endif
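The übershader pattern named above is commonly driven from the host by prepending #define lines to one shared shader source before it is compiled. The following is a minimal sketch of that idea and is not code from the talk: the helper specializeShader, the flag names MATTE and BUMP, and the shader text (including the identifier highlight) are all hypothetical.

```javascript
// Hypothetical ubershader source: one shader, variants selected by #ifdef.
const uberSource = [
  "void main() {",
  "#ifdef MATTE",
  "  gl_FragColor = diffuse;",
  "#else",
  "  gl_FragColor = diffuse + highlight;",
  "#endif",
  "}",
].join("\n");

// Prepend one "#define NAME" line per enabled flag (illustrative helper,
// not part of any graphics API).
function specializeShader(source, flags) {
  const defines = Object.keys(flags)
    .filter((name) => flags[name])
    .map((name) => `#define ${name}`);
  return defines.concat([source]).join("\n");
}

// Only MATTE is enabled, so only "#define MATTE" is prepended.
const matteVariant = specializeShader(uberSource, { MATTE: true, BUMP: false });
console.log(matteVariant.split("\n")[0]);
```

In this sketch nothing checks that a flag name spelled on the host matches the one tested inside the shader text, and every extra flag doubles the number of possible variants.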

Heterogeneous programming today

Separate programs in separate languages

Stringly typed communication

Unscalable, unsafe specialization

Classic multi-stage programming: types for metaprogramming

function pow(x, n) {
  if (n == 1) {
    return x;
  } else {
    return x * pow(x, n - 1);
  }
}

pow(2, 3) → 8

function genpow(x, n) {
  if (n == 1) {
    return x;
  } else {
    // recurse into genpow (not pow) so the result stays a code string
    return x + " * " + genpow(x, n - 1);
  }
}

genpow("2", 3) → "2 * 2 * 2"

eval(genpow("2", 3)) → 8

In genpow, n is a number; x and the result are expressions (strings).
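To make the staging idea above concrete, here is a small runnable sketch that uses the generated string to build a function specialized for a fixed exponent. The helper specializePow is an illustrative name that does not appear in the slides, and the recursive call is written as genpow so that every level returns a string.

```javascript
// Build the source text of x^n as a string (n >= 1), one factor per level.
function genpow(x, n) {
  if (n === 1) {
    return x;
  }
  return x + " * " + genpow(x, n - 1);
}

// Stage 1: generate code specialized for a fixed exponent n.
// Stage 2: turn that code into a function that takes only x.
// (specializePow is a hypothetical helper for illustration.)
function specializePow(n) {
  const body = "return " + genpow("x", n) + ";";
  return new Function("x", body);
}

const cube = specializePow(3);
console.log(genpow("2", 3)); // prints 2 * 2 * 2
console.log(cube(2));        // prints 8
```

Because the expression is an ordinary string, nothing stops a caller from passing text that is not a valid expression. That gap is what motivates giving code values their own types.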

Specializing on a compile-time parameter

gl_FragColor = if matte diffuse (diffuse + ...)
render-time parameter: condition on the GPU

gl_FragColor = [ if matte <diffuse> <diffuse + ...> ]
host-side parameter: condition on the host
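The difference between the two forms can be approximated with plain string-based code generation. This is only a sketch under stated assumptions: it is not Braid syntax or BraidGL output, the function names are hypothetical, and highlight is a stand-in for the term the slide elides with "...".

```javascript
// Hypothetical GLSL expression fragments, used only for illustration.
const DIFFUSE = "diffuse";
const SHINY = "diffuse + highlight"; // "highlight" stands in for the elided term

// Render-time parameter: the condition is emitted into the shader text,
// so it is evaluated on the GPU for every fragment.
function dynamicShader() {
  return `gl_FragColor = matte ? ${DIFFUSE} : ${SHINY};`;
}

// Host-side parameter: the condition runs on the host while the shader
// text is generated, so the emitted code contains no branch at all.
function specializedShader(matte) {
  return `gl_FragColor = ${matte ? DIFFUSE : SHINY};`;
}

console.log(dynamicShader());
console.log(specializedShader(true));
console.log(specializedShader(false));
```

The specialized variants trade one shader with a branch for two branch-free shaders, chosen on the host before drawing.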

Performance impact of specialization in BraidGL

[Figure: bar chart of frame latency (ms). Bars: original, GPU if, specialized, per-vertex.]

[Figure: bar charts of frame latency (ms), three panels. phong: original, if, static if, vertex. head: orig, no bump. couch: orig, s1, s2, s3, s4.]

Heterogeneous programming languages need support for placement and specialization.

With extensions, multi-stage programming can support both concepts.

Current APIs for real-time graphics are especially unsafe, verbose, and brittle. We can help.

!<[]>

braidgl.com