GPU to Web SIGGRAPH Asia
Transcript of GPU to Web SIGGRAPH Asia
© Copyright Khronos Group 2013 - Page 1
Bringing the Power of the GPU to the Web
Neil Trevett, Vice President NVIDIA, President Khronos
Mobile is the New Epicenter of Innovation
Khronos Standards
• Visual Computing
- Object and Terrain Visualization
- Advanced scene construction
• 3D Asset Handling
- Advanced Authoring pipelines
- 3D Asset Transmission Format with streaming and compression
• Sensor Processing
- Mobile Vision Acceleration
- On-device Sensor Fusion
• Acceleration in the Browser
- WebGL for 3D in browsers
- WebCL - Heterogeneous Computing for the web
• Camera Control API
Over 100 companies defining royalty-free APIs to connect software to silicon
Mobile Web is a Real Time Application
Buttery smooth touch interaction needs continuous 60Hz updates
Apple iPhone: 320x480, 153K pixels, 163 DPI
Apple iPad: 1024x768, 786K pixels, 132 DPI
Apple iPad Mini: 2048x1536, 3100K pixels, 326 DPI
In 5 years the number of pixels to process on mobile screens has gone up by a factor of TWENTY
Need GPU Acceleration for everything Web!
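The factor of twenty can be checked from the resolutions quoted on this slide. The sketch below is only a worked calculation: the function names are illustrative, and the assumption that every frame redraws the full screen is ours, not the slide's.

```python
# Screen resolutions quoted on the slide (width, height in pixels).
SCREENS = {
    "iPhone (320x480)": (320, 480),
    "iPad (1024x768)": (1024, 768),
    "iPad Mini (2048x1536)": (2048, 1536),
}

REFRESH_HZ = 60  # continuous update rate the slide asks for


def pixel_count(width: int, height: int) -> int:
    """Pixels in one full-screen frame."""
    return width * height


def pixels_per_second(width: int, height: int, hz: int = REFRESH_HZ) -> int:
    """Pixels touched per second, assuming every frame redraws the whole screen."""
    return pixel_count(width, height) * hz


if __name__ == "__main__":
    base = pixel_count(*SCREENS["iPhone (320x480)"])
    for name, (w, h) in SCREENS.items():
        n = pixel_count(w, h)
        print(f"{name}: {n} px/frame, {n / base:.2f}x iPhone, "
              f"{pixels_per_second(w, h) / 1e6:.1f} Mpx/s at {REFRESH_HZ} Hz")
    print(f"frame budget at {REFRESH_HZ} Hz: {1000 / REFRESH_HZ:.2f} ms")
```

2048x1536 works out to 3,145,728 pixels against 153,600 for the original iPhone, a ratio of about 20.5, and a 60Hz update leaves roughly 16.7 ms per frame.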
How are GPUs Accessible to the Web?
• Hardware composition
- Within the browser stack – under the hood
• Vector Acceleration for SVG
- Using NVIDIA OpenGL extensions
• 3D Developer Functionality
- OpenGL ES functionality through JavaScript
• Compute Acceleration
- Offloading compute intensive code to GPU
• Compression and streaming of 3D assets
- For network transmission
• Camera, vision and sensor processing
- Future JavaScript bindings to native APIs?
Mobile OS Adoption of Khronos APIs
- OpenGL ES 2.0: shipping in Android 2.2
- OpenSL ES 1.0 (subset): shipping in Android 2.3
- OpenMAX AL 1.0 (subset): shipping in Android 4.0
- EGL 1.4: shipping under SDK -> NDK
- WebGL: in Opera and Firefox now, Chrome soon
- OpenGL 3.2 on MacOS
- OpenCL 1.2 on MacOS
- OpenGL ES 3.0 on iOS
- WebGL: can be enabled in MacOS Safari; iOS5 enables WebGL for iAds
WebGL – 3D on the Web – No Plug-in!
• Leveraging HTML 5 and <canvas> element
- WebGL defines JavaScript binding to OpenGL ES 2.0
- Enables a 3D context for the canvas
• Low-level foundational Web API for accessing the GPU
- Flexibility and direct GPU access
- Enables higher-level frameworks and middleware
- Availability of OpenGL and OpenGL ES on almost every web-capable device
- JavaScript binding to OpenGL ES 2.0
- Increasing JavaScript performance
- HTML 5 Canvas Tag
WebGL Implementation Anatomy
• Content: JavaScript, HTML, CSS, ...
- Content downloaded from the Web
- JavaScript middleware can make WebGL accessible to non-expert 3D programmers
- Much WebGL content uses the three.js library: http://threejs.org/
• Browser: HTML5, JavaScript, CSS
- Browser provides WebGL functionality alongside other HTML5 technologies - no plug-in required
• OS-provided drivers: OpenGL ES 2.0, OpenGL, DX9/Angle
- WebGL on Windows can use Direct3D - for example, the Angle open source project creates OpenGL ES 2.0 over DX9
WebGL Availability in Browsers
• Microsoft - “where you have IE11, you have WebGL - turned on by default and working all the time”
• Microsoft - WebGL also enabled for Windows applications - web app framework and web view
• Apple - WebGL must be explicitly turned on in Mac Safari and is only exposed on iOS for iAds
• Chrome OS - WebGL is the only cross-platform API to program the GPU
• Google IO announcement - Chrome on Android will soon launch with WebGL
Cross-OS Portability
• Native code: C/C++
- Native code is portable, but apps must cope with different available APIs and libraries
• Platform SDKs: Dalvik (Java), Objective C, C#, DirectX
- Preferred development environments not designed for portability
• Web: HTML/CSS on every platform
- HTML5 provides cross-platform portability
- GPU accessibility through WebGL available soon on ~90% of mobile systems
WebGL First Wave Application Categories
• Maps and Navigation
• Modeling Tools and Repositories
• Games
• 3D Printing
• Visualization
• Music Videos and Promotion
• Education
• Photo Editors
• Music Visualizers
• Vision/Video Processing
Google Maps
• All rendering (2D and 3D) in Google Maps uses WebGL
Microsoft PhotoSynth2
• Demonstrated at Build 2013
http://channel9.msdn.com/Events/Build/2013/4-072 1:50
WebGL on Logan Android Tablet
OpenGL 3D API Family Tree
Timeline: 2002 to 2013
• Desktop 3D: OpenGL 1.3, 1.5, 2.0, 2.1, 3.0, 3.1, 3.2, 3.3, 4.0, 4.1, 4.2, 4.3, 4.4, then GL-Next
- OpenGL 4.4 is a superset of DX11
• Mobile 3D: OpenGL ES 1.0, 1.1, 2.0, 3.0, then ES-Next
- OpenGL ES 1.1 content: fixed function 3D pipeline
- OpenGL ES 2.0 content: programmable vertex and fragment shaders
- OpenGL ES 3.0 content: ES3 is backward compatible so new features can be added incrementally
• WebGL: WebGL 1.0, WebGL 2.0
- WebGL 2.0 is in development now - will bring OpenGL ES 3.0 functionality to the Web
http://www.khronos.org/webgl/public-mailing-list/
http://www.khronos.org/registry/webgl/specs/latest/
http://www.khronos.org/webgl/wiki/Testing/Conformance
OpenGL ES 3.0 Highlights
• Better looking, faster performing games and apps - at lower power
- Incorporates proven features from OpenGL 3.3 / 4.x
- 32-bit integers and floats in shader programs
- NPOT, 3D textures, depth textures, texture arrays
- Multiple Render Targets for deferred rendering, Occlusion Queries
- Instanced Rendering, Transform Feedback …
• Make life better for the programmer
- Tighter requirements for supported features to reduce implementation variability
• Backward compatible with OpenGL ES 2.0
- OpenGL ES 2.0 apps continue to run unmodified
• Standardized Texture Compression
- #1 developer request!
Why Khronos for WebGL?
• Hardware API standards must take into account silicon design cycles
- Multi-year pipeline of APIs that affect chips that take $100Ms to execute
- Deep insights into silicon and driver architectures
- Rigorous conformance tests and infrastructure
• Khronos is committed to being a good citizen in the larger Web community
- Opened Khronos WebGL processes to enable cooperation with web community
• Khronos is the industry forum to drive hardware consensus and cooperation
- Help create foundational support for higher-level Web standards that access
hardware capabilities
Leveraging Proven Native APIs into HTML5
• Khronos and W3C liaison
- Leverage proven native API investments into the Web
- Fast API development and deployment
- Designed by the hardware community
- Familiar foundation reduces developer learning curve
Diagram legend:
- Native APIs shipping or Khronos working group
- JavaScript API shipping, acceleration being developed or work underway
- Possible future JavaScript APIs or acceleration
Diagram layers: Native, JavaScript, HTML
Diagram entries:
- Canvas
- Path Rendering
- Camera Control
- WebVX? Vision Processing
- WebCAM(!) Camera control and video processing
- WebStream? Sensor Fusion
OpenCL as Parallel Compute Foundation
OpenCL provides vendor optimized, cross-platform, cross-vendor access to heterogeneous compute resources
Languages and tools that build on OpenCL:
- OpenCL HLM: C++ syntax/compiler extensions
- WebCL: JavaScript binding to OpenCL for initiation of OpenCL C kernels
- River Trail: language extensions to JavaScript
- C++ AMP: Shevlin Park, uses Clang and LLVM
- Harlan: high level language for GPU programming
- Compiler directives for Fortran, C and C++
- Aparapi: Java language extensions for parallelism
- PyOpenCL: Python wrapper around OpenCL
WebCL
• WebCL is a JavaScript binding to OpenCL APIs
- Enables initiation of Kernels written in OpenCL C within the browser
- Requires a conformant underlying OpenCL on the host system
• Leverage heterogeneous computing resources
- 3D asset codecs, video codecs and processing, imaging and vision processing
- Physics for WebGL games, Online data visualization, Augmented Reality
• WebCL 1.0 based on OpenCL 1.1 Embedded Profile:
- Implementations may utilize OpenCL 1.1 or 1.2
• WebCL API is designed for complete security
- Restriction of some OpenCL native functionality
- WebCL kernel validation – similar to WebGL
WebCL 1.0 Kernels
• HTML data interoperability
- <canvas>, <image>, ImageData sources bindable to WebCLBuffer & WebCLImage
- <video> tag can be bound to a WebCLImage
• Interoperability between WebCL and WebGL
- Through GL_SHARING extension
• WebCL may support the following extensions
- KHR_FP16 — 16-bit float support in kernels
- KHR_FP64 — 64-bit float support in kernels
• No 3D image support in WebCL 1.0
- May change in future WebCL versions
WebCL 1.0 Security
• Leverages OpenCL 1.2 robustness/security extensions
- Context Termination - to prevent DoS from long-running kernels
- Memory Initialization - so no leakage from out-of-bounds memory access
• Kernels passed through open source WebCL Kernel Validator
- https://github.com/KhronosGroup/webcl-validator
- Initializes local and private memory if the underlying OpenCL implementation does not implement the memory initialization extension
- Keeps track of memory allocations and traces valid ranges for reads and writes
• API/Language Restrictions and definition of undefined OpenCL behavior
- Kernels do not support structures as arguments
- Kernel names must be less than 256 characters
- Mapping of CL memory objects into host memory space is not supported
- Binary kernels are not supported
- Some OpenCL extensions may not be supported or may require translation
- Certain OpenCL parameters may not directly carry over to WebCL
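Two of the rules above, memory initialization and valid-range tracking, can be illustrated with a toy model. This is a conceptual sketch only and is not the WebCL Kernel Validator: `CheckedBuffer` and `kernel_name_allowed` are hypothetical names, and the policy of returning 0 for out-of-range reads and dropping out-of-range writes is an assumption made here for illustration.

```python
MAX_KERNEL_NAME_LEN = 256  # slide: kernel names must be less than 256 characters


def kernel_name_allowed(name: str) -> bool:
    """Accept only non-empty names shorter than 256 characters."""
    return 0 < len(name) < MAX_KERNEL_NAME_LEN


class CheckedBuffer:
    """Toy memory region: zero-initialized, with range-checked access."""

    def __init__(self, size: int) -> None:
        # Memory initialization: a fresh buffer never exposes stale data.
        self._data = [0] * size

    def _in_range(self, index: int) -> bool:
        return 0 <= index < len(self._data)

    def read(self, index: int) -> int:
        # Assumed policy: an out-of-range read yields a defined value (0)
        # instead of whatever happens to sit next to the buffer.
        return self._data[index] if self._in_range(index) else 0

    def write(self, index: int, value: int) -> None:
        # Assumed policy: an out-of-range write is silently dropped.
        if self._in_range(index):
            self._data[index] = value


if __name__ == "__main__":
    buf = CheckedBuffer(4)
    buf.write(2, 5)
    buf.write(10, 7)  # outside the tracked range, has no effect
    print(buf.read(2), buf.read(10), kernel_name_allowed("k" * 256))
```

The point of the model is that every access has defined behavior, so a kernel cannot observe or corrupt memory outside its own allocations.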
WebCL 1.0 Current Status
• WebCL 1.0 API definition is being publicly developed
- Working Public Draft first released April 2012: www.khronos.org/webcl
• WebCL distribution lists
- [email protected], [email protected]
• WebCL 1.0 specification finalization expected in 1H14
- https://cvs.khronos.org/svn/repos/registry/trunk/public/webcl/spec/latest/index.html
- With conformance tests and utilities
- Samsung contributed tests, working group reviewed
• WebCL Conformance Framework and Test Suite (WiP)
- Full API coverage and Input/output validation
- Available on GitHub: https://github.com/KhronosGroup/WebCL-conformance/
OpenCL to WebCL Translator Utility
• OpenCL to WebCL Kernel Translator
- Input: an OpenCL kernel
- Output: a WebCL kernel and a log file that details the translation process
- Tracked by a “meta” bug on Khronos public Bugzilla: http://www.khronos.org/bugzilla/show_bug.cgi?id=785
• Host API translation (WiP)
- Input: OpenCL host API calls
- Output: WebCL host API calls to be wrapped in JS
- Provides verbose translation log file, detailing the process and any constraints
- Tracked by a “meta” bug on Khronos public Bugzilla: http://www.khronos.org/bugzilla/show_bug.cgi?id=913
WebCL Prototype Implementations
• Nokia - Firefox build with integrated WebCL
- Firefox extension, open sourced May 2011 (Mozilla Public License 2.0)
- https://github.com/toaarnio/webcl-firefox
• Samsung - uses WebKit, open sourced June 2011 (BSD)
- https://github.com/SRA-SiliconValley/webkit-webcl
• Motorola Mobility - uses Node.js, open sourced April 2012 (BSD)
- https://github.com/Motorola-Mobility/node-webcl
http://fract.ured.me/ Based on Iñigo Quilez, Shader Toy Based on Apple QJulia Based on Iñigo Quilez, Shader Toy
WebCL Parallel Computing for Web Acceleration
http://www.youtube.com/user/SamsungSISA#p/a/u/1/9Ttux1A-Nuc
Khronos APIs for Augmented Reality
• Camera Control API - advanced camera control and stream generation
• 3D rendering and video composition on GPU
• Audio rendering
• Application on CPUs, GPUs and DSPs
• Sensor fusion
• Vision processing
• MEMS sensors
• EGLStream - stream data between APIs
• Precision timestamps on all sensor samples
AR needs not just advanced sensor processing, vision acceleration, computation and rendering - but also for all these subsystems to work efficiently together
W3C Augmented Web Community Group is discussing many of these issues for the Web, e.g. leveraging WebRTC in the short term: http://w3.org/community/ar
3D Needs a Transmission Format!
• Compression and streaming of 3D assets becoming essential
- Mobile and connected devices need access to increasingly large asset databases
• 3D is the last media type to define a compressed format
- 3D is more complex – diverse asset types and use cases
• Needs to be royalty-free
- Avoid an ‘internet video codec war’ scenario
• Eventually enable hardware implementations of successful codecs
- High-performance and low power – but pragmatic adoption strategy is key
Media type: Audio | Video | Images | 3D
Codec: MP3 | H.264 | JPEG | ?
An effective and widely adopted codec ignites previously unimagined opportunities for a media type
glTF – OpenGL Transmission Format
• Binary file format for efficient transmission of 3D assets
- Reduce network bandwidth and minimize client processing overhead
• Run-time neutral - does NOT imply or mandate any run-time behavior
- Can be used by any app or run-time – usually WebGL accelerated
• Scalable to handle compression and streaming
- Though baseline format does not include compression
• ‘Direct load efficiency’ for WebGL
- Little or NO processing to drop glTF data into WebGL client
• Carry conditioned data from any authoring format
- Prototyping and optimizing efficient handling of COLLADA assets
A standards-based content pipeline for rich native and Web 3D applications, from Authoring to Playback
COLLADA and glTF Open Source Ecosystem
• OpenCOLLADA Importer/Exporter and COLLADA Conformance Tests, on GitHub
• COLLADA2GLTF Translator
• Three.js glTF Importer
• Rest3D initiative
• Tool interop, web-based tools, other authoring formats
• Pervasive WebGL deployment
https://github.com/KhronosGroup/glTF
https://github.com/KhronosGroup/OpenCOLLADA
https://github.com/KhronosGroup/COLLADA-CTS
WebGL as Test-bed for 3D Asset Compression
• Integrating and benchmarking 3D geometry compression formats with glTF
- Baseline is GZIP
• Scalable Complexity 3D Mesh Compression codec MPEG-SC3DMC
- Royalty-free graphics compression technology from MPEG (MIT License)
- Open3DGC is an efficient JavaScript and C/C++ implementation
- Converter using Open3DGC to compress 3D meshes, skinning and animations
- https://github.com/amd/rest3d/tree/master/server/o3dgc
• WebGL-loader is Google's lightweight compression for WebGL content
• OpenCTM uses LZMA compression
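A GZIP baseline of the kind mentioned above is easy to measure with the Python standard library. The sketch below is an assumption-laden stand-in, not the benchmark behind these slides: `make_grid_vertices` produces synthetic float32 positions rather than a real mesh, and level 6 is used because it is zlib's customary default (Python's own `gzip.compress` defaults to 9, so the level is passed explicitly).

```python
import gzip
import struct


def gzip_size(payload: bytes, level: int = 6) -> int:
    """Size in bytes of the payload after gzip compression at the given level."""
    return len(gzip.compress(payload, compresslevel=level))


def make_grid_vertices(n: int) -> bytes:
    """Synthetic stand-in for mesh data: an n x n grid of float32 xyz positions."""
    out = bytearray()
    for i in range(n):
        for j in range(n):
            out += struct.pack("<3f", i / n, j / n, 0.0)
    return bytes(out)


if __name__ == "__main__":
    raw = make_grid_vertices(64)
    packed = gzip_size(raw)
    print(f"raw {len(raw)} bytes, gzip level 6 {packed} bytes, "
          f"ratio {len(raw) / packed:.1f}x")
```

Any candidate geometry codec can then be compared against `gzip_size` on the same input, which is the role GZIP plays as the baseline here.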
Initial Compression Results
• Compression Efficiency
- Gzip (default level=6)
- OpenCTM (default settings)
- Open3DGC and Webgl-loader: positions on 14 bits, normals and texCoords on 10 bits
• Open3DGC is 5x-9x more efficient than Gzip, 1.3x-2.4x more efficient than OpenCTM and 1.2x-1.5x more efficient than webgl-loader
[Chart: compressed size in MBytes (axis 0 to 400) for three datasets: CAD (3748 models), 3D Scanned (78 models), MPEG dataset (1211 models). Series: Gzip, OpenCTM, Webgl-loader + Gzip, Open3DGC-ASCII + Gzip, Open3DGC-Binary]
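The "positions on 14 bits" and "normals and texCoords on 10 bits" settings refer to quantizing each coordinate to a fixed number of bits before entropy coding. The slides do not say how the codecs quantize, so the sketch below assumes plain uniform scalar quantization over a known range; the function names are illustrative.

```python
def quantize(value: float, lo: float, hi: float, bits: int) -> int:
    """Map a value in [lo, hi] to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    t = (value - lo) / (hi - lo)
    t = min(max(t, 0.0), 1.0)  # clamp values that fall outside the range
    return round(t * levels)


def dequantize(code: int, lo: float, hi: float, bits: int) -> float:
    """Reconstruct an approximate value from its integer code."""
    levels = (1 << bits) - 1
    return lo + (code / levels) * (hi - lo)


def max_error(lo: float, hi: float, bits: int) -> float:
    """Worst-case reconstruction error: half of one quantization step."""
    return (hi - lo) / ((1 << bits) - 1) / 2


if __name__ == "__main__":
    for bits in (14, 10):
        code = quantize(0.3, 0.0, 1.0, bits)
        back = dequantize(code, 0.0, 1.0, bits)
        print(f"{bits} bits: code {code}, reconstructed {back:.6f}, "
              f"max error {max_error(0.0, 1.0, bits):.2e}")
```

Under this assumption 14 bits give 16,384 distinct values per axis and 10 bits give 1,024, which is why positions get the larger budget while normals and texture coordinates tolerate coarser steps.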
3DGC Decode Times
• JavaScript Decoding Speed
- Desktop machine: Windows® 64-bit, 8GB RAM, Chrome, AMD Phenom™ II X4 B95 CPU @ 3.0GHz
- Smart phone: Samsung Galaxy S4, Android 4.2.2, Chrome
Model | Number of triangles | Desktop decoding time (ms) | Smart phone decoding time (ms)
“Hand” | 100K | 130 | 1045
“Dilo” | 54K | 85 | 768
“Octopus” | 34K | 65 | 457
Decoding speed will become even more critical with dense 3D meshes
generated by 3D digitization technologies (e.g. 3D scanners)
3D Codec can be accelerated by WebCL Kernels or (eventually) hardware
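The decode times listed on this slide can be turned into throughput and slowdown figures. This is a worked calculation only; it assumes "K" means thousands of triangles, and the helper names are illustrative.

```python
# (model, triangles, desktop_ms, phone_ms) as listed on the slide
DECODE_TIMES = [
    ("Hand", 100_000, 130, 1045),
    ("Dilo", 54_000, 85, 768),
    ("Octopus", 34_000, 65, 457),
]


def triangles_per_ms(triangles: int, ms: float) -> float:
    """Decoding throughput in triangles per millisecond."""
    return triangles / ms


def slowdown(desktop_ms: float, phone_ms: float) -> float:
    """How many times longer the phone takes than the desktop."""
    return phone_ms / desktop_ms


if __name__ == "__main__":
    for name, tris, desktop_ms, phone_ms in DECODE_TIMES:
        print(f"{name}: desktop {triangles_per_ms(tris, desktop_ms):.0f} tri/ms, "
              f"phone {triangles_per_ms(tris, phone_ms):.0f} tri/ms, "
              f"phone is {slowdown(desktop_ms, phone_ms):.1f}x slower")
```

On these three models the phone comes out roughly 7x to 9x slower than the desktop, which is the gap that WebCL kernels or hardware decoding would need to close.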
Texture Compression is Key
• Texture compression saves precious resources
- Network bandwidth, device memory space AND device memory bandwidth
• Developers need the same texture compression EVERYWHERE
- Otherwise portable apps, such as WebGL, need multiple copies of the same texture
• DXTC/S3TC (Windows) and PVRTC (iOS)
- NOT royalty-free
- Platform fragmentation
• ETC1 - mandated in Android Froyo (400M devices)
- Royalty-free BUT only optional in ES
- Only 4bpp | 3 channel
- No alpha support
• ETC2 / EAC - MANDATED in OpenGL ES 3.0 and OpenGL 4.3
- Royalty-free
- Backward compatible with ETC1
- ETC2: 4bpp | 3 channel
- EAC: 4 (8) bpp | 1 (2) channel
- COMBINED: RGBA 8bpp | 4 channel WITH ALPHA
- Does not have 1-2 bit compression
• ASTC - OpenGL ES 3.0 and OpenGL 4.3 extensions -> Core once proven
- Royalty-free
- Best quality
- Independent control of bit-rate and # channels
- 1 to 4 channel
- 1-8bpp in fine steps
Chart axes: Pervasive Deployment vs. Quality; timeline 2008-2010, 2012-2013, 2014->
ASTC – Universal Texture Standard
• Adaptive Scalable Texture Compression (ASTC)
- Quality significantly exceeds S3TC or PVRTC at same bit rate
• Industry-leading orthogonal compression rate and format flexibility
- 1 to 4 color components: R / RG / RGB / RGBA
- Choice of bit rate: from 8bpp to <1bpp in fine steps
• ASTC is royalty-free and so is available to be universally adopted
- Shipping as OpenGL/OpenGL ES extension today for industry feedback
[Image comparison: original at 24bpp vs. ASTC compression at 8bpp, 3.56bpp and 2bpp]
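The bit rates quoted here follow from how ASTC stores data: every compressed block occupies 128 bits, and the bit rate depends only on how many texels the block footprint covers. The slide does not name the footprints; 4x4, 6x6 and 8x8 are inferred below because they reproduce 8, 3.56 and 2 bpp. The 1024x1024 texture size is a hypothetical example.

```python
import math

ASTC_BLOCK_BITS = 128  # every ASTC block is 128 bits, whatever its footprint


def astc_bpp(block_w: int, block_h: int) -> float:
    """Bits per pixel for a 2D ASTC block footprint."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


def astc_size_bytes(width: int, height: int, block_w: int, block_h: int) -> int:
    """Compressed size of one mip level; partial blocks are rounded up."""
    blocks = math.ceil(width / block_w) * math.ceil(height / block_h)
    return blocks * ASTC_BLOCK_BITS // 8


def uncompressed_size_bytes(width: int, height: int, bpp: int = 24) -> int:
    """Size of the same image stored uncompressed at the given bit depth."""
    return width * height * bpp // 8


if __name__ == "__main__":
    w = h = 1024  # hypothetical texture size
    print(f"uncompressed 24bpp: {uncompressed_size_bytes(w, h)} bytes")
    for bw, bh in ((4, 4), (6, 6), (8, 8), (12, 12)):
        print(f"{bw}x{bh}: {astc_bpp(bw, bh):.2f} bpp, "
              f"{astc_size_bytes(w, h, bw, bh)} bytes")
```

The largest footprint, 12x12, drops below 1 bpp, which matches the "from 8bpp to <1bpp" range on the slide.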
Path Rendering Acceleration
• Offload the CPU so the application can run as fast as possible
- Make maximum use of the GPU for best performance and power
• Option 1: CPU creates paths, CPU renders paths
- Software scanline renderers can be high quality and portable
- CPU has to process the complete pipeline - stealing cycles from the application
- Software rendering limits performance
• Option 2: CPU creates paths, CPU tessellates paths into polygons, GPU uses standard 3D commands to process polygons
- Tessellation loads the CPU - stealing cycles from the application, so perf is sometimes slower than software alone
- Tessellation consumes a lot of data and memory bandwidth = power
- Quality can be compromised due to tessellation accuracy
• Option 3: CPU creates paths, GPU processes paths directly through new OpenGL path commands
- Maximum CPU offload
- Compact data format sent to GPU renderer
- GPU provides excellent performance and power
- GPU can increase quality and functionality
NV_path_rendering OpenGL Extension
• Brings Path processing directly to OpenGL
- No tessellation necessary
• Goals
- Functionally complete for key standards: SVG, Canvas, PostScript etc.
- Much faster—often 4x to 100x faster than CPUs
- Enhanced quality – can avoid approximations needed by CPU renderers
- Lower power by leveraging dedicated hardware
- New functionality – e.g. mix 2D paths with 3D and programmable shading
Stencil then Cover Approach
• Create a path object and pass directly to the GPU
- Cubic & quadratic Bezier segments, line segments, partial elliptical arcs
• GPU “Stencils” the path object into the stencil buffer
- GPU provides massively parallel stenciling of filled or stroked paths
- Calculate winding rule or containment at every sub-pixel sample in parallel
• “Cover” the path object and stencil test against its coverage
- Test against path coverage determined in the 1st step and shade the path
• Uses GPU MSAA anti-aliasing
- 8 or 16 samples/pixel gives good quality
Step 1: Stencil -> Step 2: Cover -> repeat
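The two steps can be mimicked on the CPU to show the logic. This is a conceptual sketch under heavy simplification, not NV_path_rendering itself: it handles only closed polygons (no Bezier or arc segments), takes one sample at each pixel centre instead of 8 or 16 MSAA samples, and runs serially where the GPU would evaluate all samples in parallel. All function names are illustrative.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def winding_number(p: Point, path: List[Point]) -> int:
    """Signed number of times the closed polygon `path` winds around p."""
    x, y = p
    wn = 0
    n = len(path)
    for i in range(n):
        (x0, y0), (x1, y1) = path[i], path[(i + 1) % n]
        # cross > 0 means p lies to the left of the directed edge
        cross = (x1 - x0) * (y - y0) - (x - x0) * (y1 - y0)
        if y0 <= y < y1 and cross > 0:
            wn += 1  # upward crossing with p on the left
        elif y1 <= y < y0 and cross < 0:
            wn -= 1  # downward crossing with p on the right
    return wn


def stencil(path: List[Point], width: int, height: int) -> List[List[int]]:
    """Step 1: store the winding number at each pixel-centre sample."""
    return [[winding_number((col + 0.5, row + 0.5), path)
             for col in range(width)]
            for row in range(height)]


def cover(stencil_buf: List[List[int]], color: str = "#",
          background: str = ".") -> List[str]:
    """Step 2: shade samples with non-zero stencil, then clear the stencil."""
    image = []
    for row in stencil_buf:
        image.append("".join(color if s != 0 else background for s in row))
        row[:] = [0] * len(row)  # reset so the next path starts clean
    return image


if __name__ == "__main__":
    square = [(1, 1), (5, 1), (5, 5), (1, 5)]
    buf = stencil(square, 6, 6)
    print("\n".join(cover(buf)))
```

The non-zero test in `cover` implements the non-zero winding rule; testing `s % 2 != 0` instead would give the even-odd rule, which is why keeping the winding count in the stencil buffer lets one stencil pass serve either fill rule.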
Enhanced Quality on GPU
• Eliminate conflation artifacts
- Conflation artifacts (color bleeding) on CPU; conflation free on GPU
- Multiple color AND stencil samples per pixel
• Stroking approximations avoided by GPU
- Comparison of Cairo, NV_path_rendering and Skia; image annotations: "feathers?", "weird big holes"
• GPU offers jittered sampling for free
- Regular grid on CPU - sub-optimal antialiasing
- Jitter pattern on GPU for better antialiasing
• Proper gradient filtering on GPU
- Comparison of GPU, Qt and Cairo: Moiré artifacts on Cairo, similar for Qt & Skia
- GPUs are great at texturing: mip-mapping, anisotropic filtering, wrap modes
Comparing Performance
New GPU Functionality
• Programmable shading
- Paint in GLSL - for filter and blending acceleration
- Example: light source position for bump mapping
• Projective transformation
• Fast arbitrary path clipping
• Mixing depth tested text, 3D, and paths
• Fully sRGB correct rendering
- Linear RGB: transition between saturated red and saturated blue has dark purple region
- sRGB: perceptually smooth transition from saturated red to saturated blue
Mixing 2D and 3D
Standardization and Adoption Pipeline
• NVIDIA is proposing nvpr to the OpenGL working group at Khronos to create an open, royalty-free, cross-platform foundation for vector graphics acceleration
• Stage 1: Vendor extension to OpenGL - nvpr is here!
- Initial functionality proposal
- Prove concepts
- Solicit industry feedback
• Stage 2: OpenGL extension or core
- OpenGL vector acceleration adopted into OpenGL and OpenGL ES
- Pervasive multi-vendor availability
• Stage 3: Vector acceleration pervasive on desktop and mobile
- Widespread application usage inspires silicon optimizations
- Desktop and mobile displays typically >300 DPI
- Mobile silicon is CUDA/OpenCL capable
Path Rendering Acceleration on Android Tablet
Summary
• Open standards such as WebGL and WebCL are enabling web applications to reach the power of the GPU through JavaScript
• GPU acceleration will soon become vital for Web applications wanting to leverage advanced use of camera and sensors
• Acceleration of path primitives directly on GPUs will drive smooth browser touch performance for new classes of applications and devices
• Work is starting on 3D asset streaming and compression standards - to enable 3D as a social media type on the web
• The Web and hardware communities have a significant opportunity to leverage each other's efforts for the benefit of the industry
• Khronos is committed to enabling the hardware community to be a good citizen in creating the next generation of accelerated web standards
• www.khronos.org