Post on 24-Apr-2020
© Copyright Khronos Group 2013 - Page 1
Together We Stand Growing the AR Market through Open Standards
Why are AR Standards Needed?
Courtesy Metaio http://www.youtube.com/watch?v=xw3M-TNOo44&feature=related
State-of-the-art Augmented Reality on mobile today
Where AR Standards Can Take Us
High-Quality Reflections, Refractions, and Caustics in Augmented Reality and their Contribution to Visual Coherence
P. Kán, H. Kaufmann, Institute of Software Technology and Interactive Systems, Vienna University of Technology, Vienna, Austria
This is a research topic today on GPU-equipped laptop PCs.
We need this class of
application to:
1. Seamlessly integrate content from multiple sources
2. Run in any browser on any device without porting effort
3. Run on mobile devices at 60Hz at 500 mW or less
Your Presenters • Rob Manson is CEO & co-founder of buildAR.com, the world’s leading Augmented
Reality Content Management System. Rob is the Chair of the W3C Augmented
Web CG and an Invited Expert with the ISO, W3C and the Khronos Group. He is
one of the co-founders of ARStandards.org and is an active evangelist within the
global AR and standards communities
• Neil Trevett is the President of the Khronos Group, where he created
and chaired the OpenGL ES working group, which has defined the industry
standard for 3D graphics on embedded devices. Neil also chairs the OpenCL
working group at Khronos defining an open standard for heterogeneous
computing. He is also Vice President of Mobile Content at NVIDIA where he
drives the mobile visual computing application ecosystem
The Topics for this Morning • Why do we need open standards for AR?
- What makes successful standards?
• The AR Standardization Landscape
- Which organizations are working in AR, and how do they relate?
• AR Acceleration Standards
- Khronos acceleration API standards for low power and high performance AR
• Bringing accelerated AR into the Browser
- WebRTC and WebGL – revolutionary access to the camera and GPU in standard browsers
- The W3C Augmented Web Community Group and how to get involved
• AR Content Standards
- ARML (Augmented Reality Markup Language) and the OGC
- How to make content re-usable without stifling innovation
Why Do We Need Standards? • Standards are interoperability interfaces so that compelling user experiences
can be created inexpensively to build mass markets
- Don’t slow growth with functionality fragmentation that adds no value
• E.g. Wireless and IO standards
- GSM/EDGE, UMTS/HSPA, LTE, IEEE 802.11, Bluetooth, USB …
Standards drive mobile market
growth by expanding device
capabilities
Standards in the Real World
Vendor differences adding no value -
fragmentation is slowing growth – clear
goals emerge for a standard
REFINE BY COMMITTEE Industry agrees on what to standardize
– cooperative refinement from multiple
viewpoints creates a robust solution
A good standard enables
implementation innovation
Darwinian industry is still
experimenting with what works
and what doesn’t
DESIGN BY COMMITTEE Experimentation and design by
committee can be
slow and unfocused
A bad standard stifles innovation
and causes commoditization
Right time to Standardize?
Every successful open standard has a de facto proprietary
competitor and is open to competitive evolution
Ecosystems seem to work best when both are
healthy and evolving
Busting Some Standardization Myths • “Standards are slow to develop”
- Time to productive ecosystems is the key …
… rather than minimizing time to a proprietary specification
- OpenCL 1.0 took just 6 months – intensive cooperation
• “If I participate in standards I ‘lose’ my IP”
- Good IP Framework fully protects Members’ IP and the specification
- Members agree not to assert IP claims against other Members or Adopters
- License grant is VERY narrow – but protects the specification in practice
• “Standards are boring”
- A good standard is the industry coming together to solve real issues
Standards Needed by AR
Silicon and
Sensors
Platform
APIs
Application
Software
Cloud Services
Asset Authoring
and Delivery
GeoData and
Sensor Networks
APIs for access to graphics, compute, media, camera, vision and sensor processing
3D asset transmission format
Defining the platform capabilities of Web browsers
Reference Model for common
terminology across the industry
Defining interfaces for connecting
hardware components
Video, audio and 3D asset
compression and streaming formats
Geospatial formats and services
ISO Reference Model Architecture
MAR Engine
Sensors /
Actuators
MAR Scene
Descriptions
Additional
Media
Services
Display /
UI
AR
Defining a Reference Model
for Mixed and Augmented
Reality Systems Defining common terminology
for use by the Industry
SC24 WG9 and SC29 WG11 collaboration
SC29 WG11 AR- Ref Model document
(DTR in May 2012)
MAR Reference Model WD1.0 (N13614)
Incheon Meeting April 2013
MAR Reference Model WD1.5 LA Meeting July 2013
Liaison established
SC29 invited to participate
in SC24 meetings & vice versa
MAR Reference Model: candidate WD2.0 (N13654)
Vienna Meeting August 2013
JTC1 created a JAhG
on MAR in November
2012
MAR Reference Model: candidate WD3.0 (Nxxxx)
Geneva, October 2013
MAR Reference Model: candidate WD4.0 (Nxxxx)
Seoul, January 2014
Next meetings
SC24 WG9 ARC- Ref Model document
(officially approved in August 2012)
© 2013 Open Geospatial Consortium
Source of slide: Steve Liang, Univ. Calgary and chairman of OGC Sensor Web 4 IoT
Progression of Geospatial Information
1980s: Region-Centric Geospatial Information
1990s: Feature-Centric Geospatial Information
2000s: Human-Centric Geospatial Information
2010s: Device-Centric Geospatial Information
Progression of Geospatial Information, continued
Region-Centric, Feature-Centric, Human-Centric and Device-Centric Geospatial Information
Indoor Space
IoT Space
OGC AR Standards Activity • GeoPackage
- New Universal Geodata Format
• 3D Portrayal Services
- Services for accessing and displaying city data
• Sensor Web for IoT
- Discovery and tasking of sensor assets
• OpenPOI
- Encoding for Points of Interest in coordination with W3C
• Open GeoSMS
- SMS text message with location URL in coordination with ITU
• IndoorGML
- Indoor navigation and routing based on OGC CityGML
- Leveraging KML using COLLADA from Khronos
• ARML 2.0
- Language to describe and interact with Augmented Reality Scenes
- For AR Browsers and apps
GeoPackage - New Universal Geodata Format • GeoPackage is a universal file format for geodata
- Open, standards-based, application and platform independent
- Self-describing
- Built on SQLite, so works on any desktop or mobile OS
• The modern alternative to formats like GeoTIFF, SDTS and shapefiles
- http://www.ogcnetwork.net/geopackage
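Because a GeoPackage is an ordinary SQLite database, any SQLite client can inspect it. The sketch below is a simplified illustration, not a conformant GeoPackage: it builds an in-memory database with a cut-down version of the `gpkg_contents` table (the table a GeoPackage uses to describe its own layers) and reads it back the way a client might discover content. The layer name, bounding box and helper function names are made-up sample values.

```python
import sqlite3


def make_sample_geopackage():
    """Build an in-memory SQLite database with a cut-down gpkg_contents table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE gpkg_contents (
               table_name TEXT PRIMARY KEY,
               data_type  TEXT NOT NULL,
               identifier TEXT,
               min_x REAL, min_y REAL, max_x REAL, max_y REAL,
               srs_id INTEGER)"""
    )
    # Hypothetical sample layer; the values are illustrative only.
    db.execute(
        "INSERT INTO gpkg_contents VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        ("poi_boston", "features", "Sample points of interest",
         -71.10, 42.30, -71.00, 42.40, 4326),
    )
    db.commit()
    return db


def list_layers(db):
    """Return (table_name, data_type, srs_id) for every layer the file describes."""
    rows = db.execute(
        "SELECT table_name, data_type, srs_id FROM gpkg_contents ORDER BY table_name"
    )
    return rows.fetchall()


db = make_sample_geopackage()
layers = list_layers(db)
print(layers)
```

With a real file, `sqlite3.connect("some_file.gpkg")` would replace the in-memory database; the query itself would stay the same.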
3D Portrayal Services • Service interface for web-based scene graph rendering and image based
rendering of 3D city models
• Uses several encodings
• OGC 3D Portrayal Standards Working Group
Scenegraph
Images
OGC Sensor Web Enablement (SWE) • Discovery and tasking of sensor assets, and the access and application of sensor
observations for enhanced situational awareness
- Sensor Model Language
- Observations & Measurements
- Sensor Observation Service (SOS)
- Sensor Planning Service (SPS)
- Sensor Alert Service (SAS)
- Catalogue Service/Sensors
Points of Interest • A registry of all the places in the world, and links to all of their web resources
- http://openpois.net
- APIs to get the information as maps, XML, or JSON
- A location resource that’s always current, accurate, and authoritative
• Products & Services
- Business Pages - “white hat”, free and open B2B business registry
- http://openpois.net/b2b/
- POI Finder - API for finding POIs near a location
- example: http://openpois.net/poiquery.php?lat=42.3494&lon=-71.0408&format=json
• POI Standards development
- Began in W3C with OGC participation (Raj Singh)
- OGC formed a Standards Working Group to complete work
http://www.opengeospatial.org/projects/groups/poiswg
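The POI Finder example URL above can be assembled programmatically. This is a minimal sketch that only builds the query string from the three parameters shown on the slide (`lat`, `lon`, `format`); it sends no request and assumes nothing about the response. The function name `build_poi_query` is hypothetical, and the service may no longer be available at this address.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the slide's example URL.
POI_QUERY_ENDPOINT = "http://openpois.net/poiquery.php"


def build_poi_query(lat, lon, fmt="json"):
    """Return the POI Finder URL for a latitude/longitude pair."""
    query = urlencode({"lat": lat, "lon": lon, "format": fmt})
    return POI_QUERY_ENDPOINT + "?" + query


url = build_poi_query(42.3494, -71.0408)
params = parse_qs(urlsplit(url).query)  # round-trip check of the parameters
print(url)
```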
AR Standards Community • Seeks open and interoperable AR content and experiences
- To accelerate the process by which barriers to open and interoperable
AR content and experiences are reduced through industry-wide collaboration
- Grassroots community – no formal membership
• Conducts face-to-face meetings and maintains online resources
- A Web portal and five archived mailing lists
- The Discussion Archive - http://arstandards.org/pipermail/discussion/
- The Announcements Archive - http://arstandards.org/pipermail/news
• Participants
- Industry (companies of all sizes)
- Government
- Academia
- Individuals
AR Alliance for Enterprise • Evolution of AR Standards Community
• Focus on beachhead industrial and enterprise AR apps
Open and
Interoperable
AR content and
Experiences
Standards
Informed
user/buyers
Skilled developers/
integrators
Secure AR
Vibrant AR Ecosystem
Sound
business models
Powerful
AR-enabled
platforms
AR enablers
suitable for
IT system integration
Objective,
high quality
market education
tools and programs
Customers reaping
benefits of AR in
all industries
Applied
Research
ARAE
AR Alliance Product Goals • User/buyer and service provider education
• Results for public and members
- Lower need for individual selling (buyer awareness)
- Greater awareness of what creating AR involves
- More realistic expectations (lower FUD)
• Results for members (derived benefits)
- More successful adoption of AR (faster, more effective implementations)
- Marketplace growth (diversification of ecosystem)
- Early awareness of developments
- Larger skilled workforce
Name | Who Benefits | How achieved
Implementation Examples (Case Studies documented in writing, slides and video) | Members (access details); Public (receive executive summary) | Compiled by members; professional layout/production, editor
White papers | Members (all); Public (limited set) | Member volunteers assisted by editors
Operate community | Public + members | Member volunteers, staff and exec director
Expert board | Members | Staff and exec director
Speaker bureau | Public + members (only member speakers) | Staff and exec director
ARAE Contact:
Christine Perey <cperey@perey.com>
Khronos Connects Software to Silicon
ROYALTY-FREE, OPEN STANDARD APIs for
advanced hardware acceleration
Low level silicon to software interfaces needed on every platform
Graphics, video, audio, compute,
vision, sensor and camera processing
Defines the forward looking roadmap for
the silicon community
Shipping on billions of devices across
multiple operating systems
Rigorous conformance tests for
cross-vendor consistency
Khronos is OPEN for any company to
join and participate
Acceleration APIs BY the Industry
FOR the Industry
Khronos Standards and AR
Visual Computing - Object and Terrain Visualization - Advanced scene construction
3D Asset Authoring - Advanced Authoring pipelines
- CityGML and KML
- - 3D asset transmission
Sensor Processing - Mobile Vision Acceleration - On-device Sensor Fusion
Acceleration in the Browser - WebGL for 3D in browsers
- WebCL – GPU Compute in Browser - glTF 3D Transmission Format
Camera
Control API
Accelerating AR to Meet User Expectations • Mobile is an enabling platform for Augmented Reality
- Mobile SOC and sensor capabilities are expanding quickly
• But we need mobile AR to be 60Hz buttery smooth AND low power
- Power is now the main challenge to increasing quality of the AR user experience
• What are the silicon acceleration APIs on today’s mobile OS?
- And how can they be used to optimize AR performance AND power?
• Highlight silicon-level opportunities and challenges still to be solved
- While exploring the state of the art in mobile programming
SOC = ‘System On Chip’
Complete compute system minus memory and some peripherals
Mobile SOC Performance Increases
[Chart: CPU/GPU aggregate performance (log scale, 1 to 100) versus device shipping dates, 2011 to 2015]
- Tegra 2: Dual A9
- Tegra 3: Quad A9, power saver 5th core
- Tegra 4: Quad A15
- Logan: Full Kepler GPU, CUDA 5.0, OpenGL 4.3
- Parker: Denver 64-bit CPU, Maxwell GPU
- Example devices: HTC One X+, Google Nexus 7
100x perf increase in four years
NVIDIA Logan Mobile SOC
Kepler GPU Architecture
now on PC and Mobile.
Can run essentially the
same code – scaled for
different power
constraints
Power is the New Design Limit • The Process Fairy keeps bringing more transistors..
..but the ‘End of Voltage Scaling’ means power
is much more of an issue than in the past
In the Good Old Days: leakage was not important, and voltage
scaled with feature size
L' = L/2
D' = 1/L'^2 = 4D
f' = 2f
V' = V/2
E' = C'V'^2 = E/8
P' = P
Halve L and get 4x the transistors and
8x the capability for
the same power
The New Reality: leakage has limited threshold voltage,
largely ending voltage scaling
L' = L/2
D' = 1/L'^2 = 4D
f' = ~2f
V' = ~V
E' = C'V'^2 = E/2
P' = 4P
Halve L and get 4x the transistors and
8x the capability for
4x the power!!
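The two columns of scaling rules above can be checked with a few lines of arithmetic. The sketch below assumes that capacitance shrinks linearly with feature size (this is what makes E' come out as E/8 and E/2 on the slide) and that power for a fixed die area is density x frequency x energy per operation. The function name and dictionary keys are illustrative.

```python
def scale_process(shrink=2.0, voltage_scales=True):
    """Return density, frequency, energy per op, power and capability
    after a process shrink, each as a ratio to the previous generation."""
    density = shrink ** 2                      # D' = 1/L'^2
    frequency = shrink                         # f' = 2f for a 2x shrink
    capacitance = 1.0 / shrink                 # assumed: C scales with L
    voltage = 1.0 / shrink if voltage_scales else 1.0
    energy = capacitance * voltage ** 2        # E' = C'V'^2
    power = density * frequency * energy       # same die area
    capability = density * frequency           # transistors x clock rate
    return {
        "density": density,
        "frequency": frequency,
        "energy": energy,
        "power": power,
        "capability": capability,
    }


good_old_days = scale_process(voltage_scales=True)
new_reality = scale_process(voltage_scales=False)
print(good_old_days)
print(new_reality)
```

In both cases the capability ratio is 8; only the power ratio differs, which is the point of the slide.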
Mobile Thermal Design Point
2-4W 4-7W
6-10W 30-90W
4-5” Screen takes
250-500mW
7” Screen
takes 1W
10” Screen takes 1-2W
Resolution makes a difference -
the iPad3 screen takes up to 8W!
Typical max system power levels before thermal failure
Even as battery technology improves - these thermal limits remain
How to Save Power?
• Much more expensive to
MOVE data than COMPUTE data
• Process improvements WIDEN the gap
- 10nm process will increase ratio another 4X
• Energy efficiency must be key metric during
silicon AND app design
- Awareness of where data lives, where
computation happens, how is it scheduled
Energy per operation (for 40nm, 1V process):
32-bit Integer Add: 1pJ
32-bit Float Operation: 7pJ
32-bit Register Write: 0.5pJ
Send 32-bits 2mm: 24pJ
Send 32-bits Off-chip: 50pJ
Write 32-bits to LP-DDR2: 600pJ
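To make the move-versus-compute gap concrete, the following sketch uses only the per-operation energies listed above. The pipeline model is a deliberate simplification and an assumption of this example: each stage performs a fixed number of float operations per value and then stores one 32-bit result, either in a register or in LP-DDR2; read costs are ignored because the slide does not give them.

```python
# Energies in picojoules, copied from the slide (40nm, 1V process).
ENERGY_PJ = {
    "32-bit integer add": 1.0,
    "32-bit float operation": 7.0,
    "32-bit register write": 0.5,
    "send 32 bits 2mm": 24.0,
    "send 32 bits off-chip": 50.0,
    "write 32 bits to LP-DDR2": 600.0,
}


def cost_ratio(move, compute):
    """How many 'compute' operations cost the same energy as one 'move'."""
    return ENERGY_PJ[move] / ENERGY_PJ[compute]


def pipeline_energy_pj(num_values, ops_per_value, stages, spill_to_dram):
    """Energy of a multi-stage float pipeline under the simplified model above."""
    compute = num_values * ops_per_value * stages * ENERGY_PJ["32-bit float operation"]
    store_key = "write 32 bits to LP-DDR2" if spill_to_dram else "32-bit register write"
    storage = num_values * stages * ENERGY_PJ[store_key]
    return compute + storage


dram_vs_add = cost_ratio("write 32 bits to LP-DDR2", "32-bit integer add")
wire_vs_add = cost_ratio("send 32 bits 2mm", "32-bit integer add")
spilled = pipeline_energy_pj(1000, 10, 3, spill_to_dram=True)
on_chip = pipeline_energy_pj(1000, 10, 3, spill_to_dram=False)
print(dram_vs_add, wire_vs_add, spilled, on_chip)
```

Under this model, storing every intermediate result in LP-DDR2 dominates the total, which is why keeping data close to where it is computed matters so much.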
Hardware Saves Power, e.g. Camera Sensor ISP • CPU
- Single processor or Neon SIMD - running fast
- Makes heavy use of general memory
- Non-optimal performance and power
• GPU
- Programmable and flexible
- Many-way parallelism - run at lower frequency
- Efficient image caching close to processors
- BUT cycles frames in and out of memory
• Camera ISP (Image Signal Processor)
- Little or no programmability
- Data flows thru compact hardware pipe
- Scan-line-based - no global memory
- Best perf/watt
~760 math Ops
~42K vals = 670Kb
300MHz ~250Gops
Dark Silicon • GPUs are much more power efficient than CPUs
- When exploiting data parallelism, GPUs can be 10x as efficient, but we can go further…
• Lots of space for transistors on SOC – but can’t turn them all on at same time!
- Would exceed Thermal Design Point
• Dark Silicon - specialized hardware – only turned on when needed
- Dedicated units can increase locality and parallelism of computation
Enabling new mobile experiences requires pushing computation onto GPUs and
dedicated hardware
[Chart: Power Efficiency versus Computation Flexibility. Multi-core CPU: X1, GPU Compute: X10, Dedicated Hardware: X100]
OpenCL – Heterogeneous Computing
• Native framework for programming diverse
parallel computing resources
- CPU, GPU, DSP – as well as hardware blocks(!)
• Powerful, low-level flexibility
- Foundational access to compute resources for
higher-level engines, frameworks and languages
• Embedded profile
- No need for a separate “ES” spec
- Reduces precision requirements
A cross-platform, cross-vendor standard for
harnessing all the compute resources in an SOC
[Diagram: the same OpenCL Kernel Code is dispatched to CPUs, GPU, DSP and HW blocks]
One code tree can be executed on
CPUs, GPUs, DSPs and hardware.
Dynamically interrogate system load
and load balance work across
available processors
OpenCL Overview • C Platform Layer API
- Query, select and initialize compute devices
• Kernel Language Specification
- Subset of ISO C99 with language extensions
- Well-defined numerical accuracy - IEEE 754 rounding with specified max error
- Rich set of built-in functions: cross, dot, sin, cos, pow, log …
• C Runtime API
- Runtime or build-time compilation of kernels
- Execute compute kernels across multiple devices
• Memory management is explicit
- Application must move data from
host -> global -> local and back
- Implementations can optimize data movement
in unified memory systems
OpenCL: Execution Model • Kernel
- Basic unit of executable code ~ C function
- Data-parallel or task-parallel
• Program
- Collection of kernels and functions
~ dynamic library with run-time linking
• Command Queue
- Applications queue kernels & data transfers
- Performed in-order or out-of-order
• Work-item
- An execution of a kernel by a processing
element ~ thread
• Work-group
- A collection of related work-items that execute
on a single compute unit ~ core
Example of parallelism types
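The relationship between kernels, work-items and work-groups can be sketched in plain Python. This is an emulation for illustration only, not OpenCL code: `run_ndrange` is a hypothetical helper that calls a Python function once per work-item in a one-dimensional index space, sequentially, where a real device would run the work-items in parallel. It assumes the global size is a multiple of the work-group size.

```python
def run_ndrange(kernel, global_size, local_size, *args):
    """Emulate a 1D index space: one kernel call per work-item,
    with work-items grouped into work-groups of local_size."""
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    for group_id in range(global_size // local_size):   # one work-group at a time
        for local_id in range(local_size):               # work-items in the group
            global_id = group_id * local_size + local_id
            kernel(global_id, local_id, group_id, *args)


def vector_add(global_id, local_id, group_id, a, b, out):
    """Data-parallel kernel body: each work-item handles one element."""
    out[global_id] = a[global_id] + b[global_id]


a = list(range(8))
b = [10 * x for x in range(8)]
out = [0] * 8
run_ndrange(vector_add, 8, 4, a, b, out)   # 8 work-items in 2 work-groups of 4
print(out)
```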
OpenCL Built-in Kernels • Used to control non-OpenCL C-capable
resources on an SOC – ‘Custom Devices’
- E.g. Video encode/decode, Camera ISP …
• Represent functions of Custom Devices as an
OpenCL kernel
- Can enqueue Built-in Kernels to Custom
Devices alongside standard OpenCL kernels
• OpenCL run-time a powerful coordinating
framework for ALL SOC resources
- Programmable and custom devices
controlled by one run-time
Built-in kernels enable control of specialized processors and hardware
from OpenCL run-time
OpenCL SPIR 1.2 Provisional released!
OpenCL Roadmap
OpenCL 2.0
Significant enhancements to memory and execution models to
expose emerging hardware capabilities and provide increased
flexibility, functionality and performance to developers
OpenCL SPIR (Standard Parallel Intermediate Representation)
LLVM-based, low-level Intermediate Representation for IP Protection and as
target back-end for alternative high-level languages
OpenCL HLM (High Level Model)
High-level programming model, unifying host and device execution environments through
language syntax for increased usability and broader optimization opportunities
OpenCL 2.0 Provisional released!
Mobile OpenCL Shipping • Android ICD extension released in latest extension specification
- OpenCL implementations can be discovered and loaded as a shared object
• Multiple implementations shipping in Android NDK
- ARM, Imagination, Vivante, Qualcomm, Samsung …
Key OpenCL 2.0 Features • Shared Virtual Memory
- Host and device kernels can directly share complex, pointer-containing data
structures such as trees and linked lists, providing significant programming
flexibility and eliminating costly data transfers between host and devices
• Dynamic Parallelism
- Device kernels can enqueue kernels to the same device with no host interaction,
enabling flexible work scheduling paradigms and avoiding the need to transfer
execution control and data between the device and host, often significantly
offloading host processor bottlenecks
• Generic Address Space
- Functions can be written without specifying a named address space for
arguments, especially useful for those arguments that are declared to be a
pointer to a type, eliminating the need for multiple functions to be written for
each named address space used in an application
Key OpenCL 2.0 Features – continued… • Images
- Improved image support including sRGB images and 3D image writes, the ability
for kernels to read from and write to the same image, and the creation of
OpenCL images from a mip-mapped or a multi-sampled OpenGL texture for
improved OpenGL interop
• C11 Atomics
- Subset of C11 atomics and synchronization operations to enable assignments in
one work-item to be visible to other work-items in a work-group, across work-
groups executing on a device or for sharing data between OpenCL device and host
• Pipes
- Pipes are memory objects that store data organized as a FIFO. OpenCL 2.0
provides built-in functions for kernels to read from or write pipes, providing
straightforward programming that can be highly optimized by implementers
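The FIFO behaviour of a pipe can be pictured with a small Python model. This is not the OpenCL API: the `Pipe` class, its method names and the status-code convention (0 for success, -1 for failure) are illustrative choices for this sketch, and the producer and consumer run one after the other rather than concurrently as kernels would.

```python
from collections import deque


class Pipe:
    """Bounded FIFO standing in for a pipe memory object."""

    def __init__(self, max_packets):
        self.max_packets = max_packets
        self._packets = deque()

    def write(self, packet):
        """Return 0 on success, -1 if the pipe is full."""
        if len(self._packets) >= self.max_packets:
            return -1
        self._packets.append(packet)
        return 0

    def read(self):
        """Return (0, packet) on success, (-1, None) if the pipe is empty."""
        if not self._packets:
            return -1, None
        return 0, self._packets.popleft()


def producer(pipe, values):
    """Write squared values; return the inputs that did not fit."""
    return [v for v in values if pipe.write(v * v) != 0]


def consumer(pipe):
    """Drain the pipe in FIFO order, adding 1 to each packet."""
    results = []
    while True:
        status, packet = pipe.read()
        if status != 0:
            return results
        results.append(packet + 1)


pipe = Pipe(max_packets=4)
rejected = producer(pipe, [1, 2, 3, 4, 5])
results = consumer(pipe)
print(rejected, results)
```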
OpenCL as Parallel Compute Foundation
OpenCL provides vendor optimized,
cross-platform, cross-vendor access to
heterogeneous compute resources
- OpenCL HLM: C++ syntax/compiler extensions
- WebCL: JavaScript binding to OpenCL for initiation of OpenCL C kernels
- River Trail: Language extensions to JavaScript
- C++ AMP: Shevlin Park uses Clang and LLVM
- Harlan: High level language for GPU programming
- Compiler directives for Fortran, C and C++
- Aparapi: Java language extensions for parallelism
- PyOpenCL: Python wrapper around OpenCL
OpenGL 3D API Family Tree
[Timeline figure, 2002 to 2013]
- Desktop 3D: OpenGL 1.3, 1.5, 2.0, 2.1, 3.0, 3.1, 3.2, 3.3, 4.0, 4.1, 4.2, 4.3, 4.4, GL-Next
- Mobile 3D: OpenGL ES 1.0, 1.1, 2.0, 3.0, ES-Next
- WebGL 1.0, WebGL-Next
- OpenGL ES 1.1 Content: Fixed function 3D Pipeline
- OpenGL ES 2.0 Content: Programmable vertex and fragment shaders
- OpenGL ES 3.0 Content: ES3 is backward compatible so new features can be added incrementally
- OpenGL 4.4 is a superset of DX11
OpenGL ES 3.0 Highlights • Better looking, faster performing games and apps – at lower power
- Incorporates proven features from OpenGL 3.3 / 4.x
- 32-bit integers and floats in shader programs
- NPOT, 3D textures, depth textures, texture arrays
- Multiple Render Targets for deferred rendering, Occlusion Queries
- Instanced Rendering, Transform Feedback …
• Make life better for the programmer
- Tighter requirements for supported features to reduce implementation variability
• Backward compatible with OpenGL ES 2.0
- OpenGL ES 2.0 apps continue to run unmodified
• Standardized Texture Compression
- #1 developer request!
Accelerating OpenGL Innovation
Bringing state-of-the-art functionality to cross-platform graphics
[Timeline figure, 2004 to 2013]
- OpenGL: 2.0, 2.1, 3.0, 3.1, 3.2, 3.3/4.0, 4.1, 4.2, 4.3, 4.4
- DirectX: 9.0c, 10.0, 10.1, 11, 11.1
OpenGL 4.3 Compute Shaders • Execute algorithmically general-purpose GLSL shaders
- Can operate on uniforms, images and textures
• Process graphics data in the context of the graphics pipeline
- Easier than interoperating with a compute API IF processing ‘close to the pixel’
• Standard part of all OpenGL 4.3 implementations
- Matches DX11 DirectCompute functionality
Physics AI Simulation Ray Tracing Imaging Global Illumination
OpenCL and OpenGL Compute Shaders • OpenGL compute shaders and OpenCL support distinctly different use cases
- OpenCL provides a significantly more powerful and complete compute solution
OpenCL (pure compute apps touching no pixels)
1. Full ANSI C programming of
heterogeneous CPUs and GPUs
2. Utilize multiple processors
3. Precisely defined IEEE accuracy
Compute Shaders (enhanced 3D graphics apps, “Shaders++”)
1. Fine grain compute operations
inside OpenGL
2. GLSL Shading Language
3. Execute on single GPU only
Example workloads: Imaging, Video, Physics, AI
Visual Sensor Revolution • Single sensor RGB cameras are just the start of the mobile visual revolution
- IR sensors – LEAP Motion, eye-trackers
• Multi-sensors: Stereo pairs -> Plenoptic array -> Depth cameras
- Stereo pair can enable object scaling and enhanced depth extraction
- Plenoptic Field processing needs FFTs and ray-casting
• Hybrid visual sensing solutions
- Different sensors mixed for different distances and lighting conditions
• GPUs today – more dedicated ISPs tomorrow?
Dual Camera (LG Electronics)
Plenoptic Array (Pelican Imaging)
Capri Structured Light 3D Camera (PrimeSense)
OpenVX • Vision Hardware Acceleration Layer
- Enables hardware vendors to implement
accelerated imaging and vision algorithms
- For use by high-level libraries or apps
• Focus on enabling real-time vision
- On mobile and embedded systems
• Diversity of efficient implementations
- From programmable processors, through
GPUs to dedicated hardware pipelines
Open source sample
implementation
Hardware vendor
implementations
OpenCV open
source library
Other higher-level
CV libraries
Application
Dedicated hardware can help make vision
processing performant and low-power enough
for pervasive ‘always-on’ use
OpenVX - Power Efficient Vision Acceleration • Create vision processing graph for power and performance efficiency
- Each Node can be implemented in software or accelerated hardware
- Nodes may be fused by the implementation to eliminate memory transfers
• EGLStreams can provide data and event interop with other APIs
- BUT use of other Khronos APIs is not mandated
• VXU Utility Library provides efficient access to single nodes
- Open source implementation – easy way to start using OpenVX
[Diagram: Native Camera Control feeds a graph of OpenVX Nodes executed by Heterogeneous Processing]
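Why fusing graph nodes saves memory traffic can be shown with a toy model. The sketch below is not the OpenVX API: nodes are plain Python functions applied per pixel, and "fusion" simply means applying the whole chain to each pixel before writing it out. It assumes every node is a per-pixel operation; nodes that read a neighbourhood of pixels would need a more careful scheme.

```python
def run_unfused(nodes, pixels):
    """Run each node over the whole image, storing every intermediate image."""
    buffer_writes = 0
    current = pixels
    for node in nodes:
        current = [node(p) for p in current]   # full intermediate image written
        buffer_writes += len(current)
    return current, buffer_writes


def run_fused(nodes, pixels):
    """Apply the whole chain to each pixel, writing only the final image."""
    out = []
    for p in pixels:
        for node in nodes:
            p = node(p)
        out.append(p)
    return out, len(out)


def brighten(p):
    return min(p + 20, 255)


def invert(p):
    return 255 - p


def threshold(p):
    return 255 if p >= 128 else 0


graph = [brighten, invert, threshold]
image = [0, 100, 200, 250]
unfused_result, unfused_writes = run_unfused(graph, image)
fused_result, fused_writes = run_fused(graph, image)
print(unfused_result, unfused_writes, fused_writes)
```

Both paths produce the same image, but the fused path writes one buffer instead of three, which is the kind of saving a graph-based implementation is free to make.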
OpenVX and OpenCV are Complementary
Aspect | OpenCV | OpenVX
Governance | Open source; community driven; no formal specification | Formal specification and full conformance tests; implemented by hardware vendors
Scope | Very wide: 1000s of functions of imaging and vision; multiple camera APIs/interfaces | Tight focus on hardware accelerated functions for mobile vision; use external camera API
Conformance | No conformance testing; every vendor implements different subset | Full conformance test suite / process; reliable acceleration platform
Use Case | Rapid prototyping | Production deployment
Efficiency | Memory-based architecture; each operation reads and writes memory; sub-optimal power / performance | Graph-based execution; optimized nodes and data transfer; highly efficient
Typical Imaging Pipeline • Pre- and Post-processing can be done on CPU, GPU, DSP…
• ISP controls camera via 3A algorithms
Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)
• ISP may be a separate chip or within Application Processor
[Pipeline diagram: Lens -> Color Filter Array -> CMOS sensor -> Pre-processing -> Image Signal Processor (ISP) -> Post-processing -> App. Sensor data is in Bayer format, ISP output is RGB/YUV, and 3A drives lens, sensor and aperture control]
Need for advanced
camera control API!
Camera Control API Goals • Provide functional portability for advanced camera applications
- Reduce extreme fragmentation for ISVs wanting more than point and shoot
• Generate image bursts with parameterized camera control and ISP control
- For downstream processing by flexible combination of CPU, GPU and DSP
• Control multiple sensors with multi-sensor synch and alignment
- Stereo pairs, Plenoptic arrays, Depth Cameras
• Enable system-wide sensor time-stamping
- Synchronize MEMS and image sensor samples
• This functionality is not available on any current platform APIs
- Make this API align with future platform direction for easy adoption
Advanced Camera Control Use Cases • High-dynamic range (HDR) and computational flash photography
- High-speed burst with individual frame control over exposure and flash
• Rolling shutter elimination
- High-precision intra-frame synchronization between camera and motion sensor
• HDR Panorama, photo-spheres
- Continuous frame capture with constant exposure and white balance
• Subject isolation and depth detection
- High-speed burst with individual frame control over focus
• Time-of-flight or structured light depth camera processing
- Aligned stacking of data from multiple sensors
• Augmented Reality
- 60Hz, low-latency capture with motion sensor synchronization
- Multiple Region of Interest (ROI) capture
- Multiple sensors for scene scaling
- Detailed feedback on camera operation per frame
Camera API Architecture (FCAM based) • No global state
- State travels with image requests
- Every stage in the pipeline may have different state
- -> allows fast, deterministic state changes
• Synchronize devices
- Lens, flash, sound capture, gyro…
- Devices can schedule Actions
- E.g. to be triggered on exposure change
- Enables device synchronization
FCam concepts • App requests Shots to a Sensor
• Sensor returns Frames asynchronously
• A Shot specifies capture and post-
processing parameters
• A Frame contains
- Output image data
- Metadata
- Statistics
• Devices (e.g. Lens, Flash) can schedule
Actions (e.g. ‘fire the flash’)
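The Shot/Frame model can be sketched as a small Python simulation. This is not the FCam API (which is a C++ library): the class and method names below are stand-ins chosen for this example, the queue only imitates asynchronous delivery, and the exposure arithmetic is a made-up toy model. What it does show is the "no global state" idea: each Shot carries its own parameters, and each returned Frame records the Shot that produced it.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Shot:
    """Capture parameters that travel with the request."""
    exposure_us: int
    gain: float = 1.0
    flash: bool = False


@dataclass
class Frame:
    """Result of one Shot: image data plus metadata and statistics."""
    shot: Shot
    image: list
    metadata: dict = field(default_factory=dict)
    statistics: dict = field(default_factory=dict)


class Sensor:
    """Toy sensor: shots are queued, frames come back later in request order."""

    def __init__(self, scene_luminance):
        self.scene = scene_luminance
        self._pending = deque()
        self._frame_count = 0

    def capture(self, shot):
        self._pending.append(shot)

    def get_frame(self):
        shot = self._pending.popleft()
        # Toy exposure model: brightness scales with exposure and gain, clipped at 255.
        image = [min(255, int(lum * shot.exposure_us * shot.gain / 1000))
                 for lum in self.scene]
        self._frame_count += 1
        return Frame(shot=shot, image=image,
                     metadata={"frame_number": self._frame_count},
                     statistics={"mean": sum(image) / len(image)})


sensor = Sensor(scene_luminance=[1, 10, 100])
for exposure in (1000, 4000, 16000):          # an HDR-style exposure burst
    sensor.capture(Shot(exposure_us=exposure))
frames = [sensor.get_frame() for _ in range(3)]
print([f.image for f in frames])
```

Because every Frame carries its Shot, a burst with a different exposure per frame needs no global reconfiguration between captures.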
Extensions to FCam model • Timing & Synchronization between cameras and with MEMS sensors
• ISP model (including 3A)
• Multiple cameras
• Multiple ISPs
• Re-entrant ISPs
• Multiple output streams
• Efficient memory allocation
• Streaming rows (not just frames)
• Image types - aligned with MIPI CSI specifications
• Metadata & Statistics
• Regions of Interest
• Vendor extensions – specialized formats and capabilities
Camera API Design Philosophy • C-language API starting from proven designs
- e.g. FCAM, Android camera platform
• Design alignment with widely used hardware standards
- e.g. MIPI CSI
• Focus on mobile, power-limited devices
- But do not preclude other use cases such as automotive, surveillance, DSLR…
• Minimize overlap and maximize interoperability with other Khronos APIs
- But other Khronos APIs are not required
• Provide support for vendor-specific extensions
Apr13
Jul13
Group charter approved
4Q13
Provisional specification
1Q14
First draft specification
2Q14
Sample implementation and
tests
3Q14
Specification ratification
Low Power Environment Scanning • Many sensor use cases would consume too much power to be running 24/7
- Environment aware use cases have to be very low power
• ‘Scanners’ - very low power, always on, detect things in the environment
- Trigger the next level of processing capability
ARM 7: 1 MIP and accelerometers can
detect someone in the vicinity
DSP: Low power activation of camera
to detect someone in field of view
GPU: GPU acceleration for precision
gesture processing
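The escalation from an always-on scanner to heavier processors can be sketched as a simple cascade. The sample format (`motion` and `person_in_view` flags) and the function name are hypothetical; the sketch only counts how often each tier would be woken and makes no claim about actual power figures.

```python
def run_cascade(samples):
    """Count how often each processing tier runs for a stream of samples."""
    counts = {"arm7": 0, "dsp": 0, "gpu": 0}
    for sample in samples:
        counts["arm7"] += 1                 # always-on scanner checks every sample
        if not sample["motion"]:
            continue
        counts["dsp"] += 1                  # camera woken only after motion
        if not sample["person_in_view"]:
            continue
        counts["gpu"] += 1                  # gesture processing only when someone is in view
    return counts


# Ten samples: three with motion, and one of those with a person in view.
samples = [{"motion": False, "person_in_view": False} for _ in range(7)]
samples += [
    {"motion": True, "person_in_view": False},
    {"motion": True, "person_in_view": False},
    {"motion": True, "person_in_view": True},
]
counts = run_cascade(samples)
print(counts)
```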
Sensor Industry Fragmentation …
StreamInput Sensor Fusion Stack
[Stack diagram: Applications; Middleware (e.g. Augmented Reality engines, gaming engines); OS sensor APIs (e.g. Android SensorManager or iOS CoreMotion); StreamInput; Sensor Hubs; Sensors]
- Low-level native API defines access to fused sensor data stream and context-awareness
- StreamInput implementations compete on sensor stream quality, reduced power consumption, environment triggering and context detection, enabling sensor subsystem vendors to increase ADDED VALUE
- Platforms can provide increased access to improved sensor data stream, driving faster, deeper sensor usage by applications
- Middleware engines need platform-portable access to native, low-level sensor data stream
- Mobile or embedded platforms without sensor fusion APIs can provide direct application access to StreamInput
- Hardware transport interfaces are defined by each system, e.g. IIO or HID sensor
Khronos APIs for Augmented Reality
[Diagram: AR processing pipeline and the Khronos APIs that serve each block]
• MEMS Sensors
• Sensor Fusion
- Precision timestamps on all sensor samples
• Camera Control API
- Advanced camera control and stream generation
• Feature Tracking
• Application on CPUs, GPUs and DSPs
• 3D Rendering and Video Composition on GPU
• Audio Rendering
• EGLStream
- Stream frames between APIs
AR needs not just advanced sensor processing, vision acceleration, computation and rendering - but also for all these subsystems to work efficiently together
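One reason precision timestamps on every sensor sample matter is that sensor readings and camera frames arrive at different rates, so a tracker has to estimate the sensor value at the instant a frame was captured. The sketch below shows one simple way to do that with linear interpolation. It assumes a shared clock and strictly increasing timestamps; the function name `sample_at` and the gyro values are hypothetical and not taken from StreamInput or any other Khronos API.

```python
from bisect import bisect_left
from typing import Sequence


def sample_at(timestamps: Sequence[int], values: Sequence[float], t: int) -> float:
    """Linearly interpolate a sensor reading at time t.

    Assumes timestamps are strictly increasing and share a clock with t.
    Times outside the recorded range clamp to the nearest sample.
    """
    if not timestamps:
        raise ValueError("no sensor samples available")
    i = bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    t0, t1 = timestamps[i - 1], timestamps[i]
    weight = (t - t0) / (t1 - t0)
    return values[i - 1] + weight * (values[i] - values[i - 1])


# Hypothetical gyro yaw samples: nanosecond timestamps and degrees.
gyro_t = [0, 10_000_000, 20_000_000, 30_000_000]
gyro_yaw = [0.0, 1.0, 3.0, 6.0]

# A camera frame captured between the second and third gyro samples.
frame_t = 15_000_000
yaw_at_frame = sample_at(gyro_t, gyro_yaw, frame_t)
print(yaw_at_frame)  # 2.0
```

Without reliable timestamps on both streams, this alignment step has nothing to work from, and the rendered overlay drifts against the video.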
OS API Adoption
• OpenGL ES 2.0: shipping in Android 2.2
• OpenSL ES 1.0 (subset): shipping in Android 2.3
• OpenMAX AL 1.0 (subset): shipping in Android 4.0
• EGL 1.4: shipping under SDK -> NDK
• WebGL: in Opera and Firefox now, Chrome soon
• OpenGL 3.2 on MacOS
• OpenCL 1.2 on MacOS
• OpenGL ES 3.0 on iOS
• WebGL: can be enabled in MacOS Safari; iOS5 enables WebGL for iAds
Leveraging Proven Native APIs into HTML5
• Khronos and W3C liaison
- Leverage proven native API investments into the Web
- Fast API development and deployment
- Designed by the hardware community
- Familiar foundation reduces developer learning curve
[Diagram: native APIs and their JavaScript/HTML counterparts]
• Legend
- Native APIs shipping or Khronos working group
- JavaScript API shipping, acceleration being developed or work underway
- Possible future JavaScript APIs or acceleration
• Possible future APIs
- WebVX? Vision processing
- WebCAM(!) Camera control and video processing
- WebStream? Sensor fusion
• Other diagram labels: Native, JavaScript, Canvas, Path Rendering, Camera Control, HTML
WebGL Implementation Anatomy
• Content: JavaScript, HTML, CSS, ...
- Content downloaded from the Web
• JavaScript Middleware
- Middleware can make WebGL accessible to non-expert 3D programmers
• Browser: HTML5, JavaScript, CSS
- Browser provides WebGL functionality alongside other HTML5 technologies - no plug-in required
• OS Provided Drivers: OpenGL ES 2.0, OpenGL, DX9/Angle
- WebGL on Windows can use Google Angle to create conformant OpenGL ES 2.0 over DX9
WebGL Availability in Browsers
• Microsoft - “where you have IE11, you have WebGL – turned on by default and working all the time”
• Microsoft - WebGL also enabled for Windows applications - web app framework and web view
• Apple - WebGL must be explicitly turned on in Mac Safari and is only exposed on iOS for iAds
• Chrome OS - WebGL is the only cross-platform API to program the GPU
• Google IO announcement - Chrome on Android will soon launch with WebGL
Much WebGL content uses the three.js library: http://threejs.org/
Cross-OS Portability
[Diagram: development options per OS. Labels: C/C++; SDK Dalvik (Java); Objective C; C#; DirectX; HTML/CSS on each platform]
• Preferred development environments not designed for portability
• Native code is portable - but apps must cope with different available APIs and libraries
• HTML5 provides cross-platform portability. GPU accessibility through WebGL available soon on ~90% of mobile systems
WebGL First Wave Application Categories
• Maps and Navigation
• Modeling Tools and Repositories
• Games
• 3D Printing
• Visualization
• Music Videos and Promotion
• Education
• Photo Editors
• Music Visualizers
• Vision/Video Processing
Google Maps
• All rendering (2D and 3D) in Google Maps uses WebGL
Microsoft PhotoSynth2
• Demonstrated at Build 2013
http://channel9.msdn.com/Events/Build/2013/4-072 1:50
WebCL – Parallel Computing for the Web
• JavaScript bindings to OpenCL APIs
- Enables initiation of Kernels written in OpenCL C within the browser
http://www.youtube.com/user/SamsungSISA#p/a/u/1/9Ttux1A-Nuc
3D Needs a Transmission Format!
• Compression and streaming of 3D assets becoming essential
- Mobile and connected devices need access to increasingly large asset databases
• 3D is the last media type to define a compressed format
- 3D is more complex – diverse asset types and use cases
• Needs to be royalty-free
- Avoid an ‘internet video codec war’ scenario
• Eventually enable hardware implementations of successful codecs
- High-performance and low power – but pragmatic adoption strategy is key
Media type and its widely adopted codec:
• Audio: MP3
• Video: H.264
• Images: JPEG
• 3D: ?!
An effective and widely adopted codec ignites previously unimagined opportunities for a media type
glTF Goals
• Binary file format for efficient transmission for 3D assets
- Reduce network bandwidth and minimize client processing overhead
• Run-time neutral - DO NOT IMPLY OR MANDATE ANY RUN-TIME BEHAVIOR
- Can be used by any app or run-time – usually WebGL accelerated
• Scalable to handle compression and streaming
- Though baseline format does not include compression
• ‘Direct load efficiency’ for WebGL
- Little or NO processing to drop glTF data into WebGL client
• Carry conditioned data from any authoring format
- Prototyping and optimizing efficient handling of COLLADA assets
A standards-based content pipeline for rich native and Web 3D applications
[Diagram labels: Authoring, Playback]
COLLADA and glTF Open Source Ecosystem
[Diagram: open source content pipeline from authoring tools to WebGL deployment]
• Tool Interop
- OpenCOLLADA Importer/Exporter and COLLADA Conformance Tests on GitHub
• Other authoring formats
• COLLADA2GLTF Translator
• Web-based Tools
• Three.js glTF Importer. Rest3D initiative
• Pervasive WebGL deployment
https://github.com/KhronosGroup/glTF
https://github.com/KhronosGroup/OpenCOLLADA
https://github.com/KhronosGroup/COLLADA-CTS
WebGL as Test-bed for 3D Asset Compression
• Integrating and benchmarking 3D geometry compression formats with glTF
- Baseline is GZIP
- Open3DGC - implementation of the MPEG-SC3DMC - Scalable Complexity 3D Mesh Compression codec
- webgl-loader is Google's lightweight compression for WebGL content

Total size per format for two models (component sizes in parentheses; the model names are not given in the text):

Format                                         First model                  Second model
COLLADA XML                                    336k                         56763k
COLLADA gzip                                   106k                         7378k
glTF+webgl-loader raw                          54k (utf8:42k, JSON:12k)     9500k (utf8:8747k, JSON:753k)
glTF+webgl-loader gzip                         36k (utf8:34k, JSON:2k)      1354k (utf8:1325k, JSON:29k)
glTF+Open3DGC ascii raw                        40k (ascii:29k, JSON:11k)    8380k (ascii:7793k, JSON:587k)
glTF+Open3DGC ascii gzip                       21k (ascii:19k, JSON:2k)     1462k (ascii:1433k, JSON:29k)
glTF+Open3DGC binary raw                       29k (bin:18k, JSON:11k)      3794k (bin:3205k, JSON:589k)
glTF+Open3DGC binary (raw bin, gzip JSON)      20k (bin:18k, JSON:2k)       3234k (bin:3205k, JSON:29k)
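The totals above are easier to compare as ratios against a common baseline. The short sketch below recomputes them relative to gzipped COLLADA. The numbers are copied from the two total rows of the table; the labels "first model" and "second model" and the function name `ratios_vs_baseline` are placeholders introduced here, since the slide text does not name the models.

```python
# Total sizes in KB, copied from the benchmark table. The model labels are
# placeholders because the slide text does not name the two models.
SIZES_KB = {
    "first model": {
        "COLLADA XML": 336,
        "COLLADA gzip": 106,
        "glTF+webgl-loader raw": 54,
        "glTF+webgl-loader gzip": 36,
        "glTF+Open3DGC ascii raw": 40,
        "glTF+Open3DGC ascii gzip": 21,
        "glTF+Open3DGC binary raw": 29,
        "glTF+Open3DGC binary (raw bin, gzip JSON)": 20,
    },
    "second model": {
        "COLLADA XML": 56763,
        "COLLADA gzip": 7378,
        "glTF+webgl-loader raw": 9500,
        "glTF+webgl-loader gzip": 1354,
        "glTF+Open3DGC ascii raw": 8380,
        "glTF+Open3DGC ascii gzip": 1462,
        "glTF+Open3DGC binary raw": 3794,
        "glTF+Open3DGC binary (raw bin, gzip JSON)": 3234,
    },
}


def ratios_vs_baseline(sizes, baseline="COLLADA gzip"):
    """Return how many times smaller each format is than the baseline."""
    base = sizes[baseline]
    return {name: round(base / kb, 2) for name, kb in sizes.items()}


for model, sizes in SIZES_KB.items():
    print(model)
    for name, ratio in ratios_vs_baseline(sizes).items():
        print(f"  {name}: {ratio}x")
```

Read this way, the gzipped glTF variants come out roughly 3x to 5x smaller than gzipped COLLADA for both models, while the binary Open3DGC result on the second model is only about 2.3x smaller, so the best option is not the same for every asset.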
Compression Example Results Overview
• Early days – Khronos embarking on methodical analysis using glTF as test-bed
• For mobile - need to balance file size AND decompression processing
- Extensive processing can take more time/power than transmission
• OpenCTM is promising but LZMA is very processor intensive
- Work may lead to LZMA in hardware?
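The trade-off between file size and decompression cost can be measured directly. The sketch below compares DEFLATE (the algorithm inside gzip) with LZMA using Python's standard `zlib` and `lzma` modules on a synthetic vertex buffer. The payload, the function names and the 1 MB/s link speed are assumptions made for illustration; this is not OpenCTM or glTF data, and the timings depend entirely on the machine running it.

```python
import lzma
import math
import struct
import time
import zlib


def make_vertex_payload(n=20000):
    """Synthetic stand-in for mesh data: n float32 xyz vertices on a spiral."""
    return b"".join(
        struct.pack("<3f", math.cos(i * 0.001), math.sin(i * 0.001), i * 0.0001)
        for i in range(n)
    )


def measure(name, compress, decompress, payload, repeats=5):
    """Compress once, then time decompression averaged over several runs."""
    blob = compress(payload)
    start = time.perf_counter()
    for _ in range(repeats):
        restored = decompress(blob)
    decode_s = (time.perf_counter() - start) / repeats
    assert restored == payload  # round trip must be lossless
    return {"codec": name, "bytes": len(blob), "decode_s": decode_s}


def total_time_s(result, bytes_per_s):
    """Transfer time plus decode time for a given (assumed) link speed."""
    return result["bytes"] / bytes_per_s + result["decode_s"]


payload = make_vertex_payload()
results = [
    measure("zlib (DEFLATE, as used by gzip)", zlib.compress, zlib.decompress, payload),
    measure("lzma", lzma.compress, lzma.decompress, payload),
]

LINK_BYTES_PER_S = 1_000_000  # hypothetical mobile link, for illustration only
for r in results:
    print(r["codec"], r["bytes"], "bytes,",
          f"decode {r['decode_s']:.4f}s,",
          f"total {total_time_s(r, LINK_BYTES_PER_S):.4f}s")
```

Running it on a target device shows whether the smaller file actually arrives and decodes sooner, which is the question the slide raises for mobile.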
Texture Compression is Key
• Texture compression saves precious resources
- Network bandwidth, device memory space AND device memory bandwidth
• Developers need the same texture compression EVERYWHERE
- Otherwise portable apps, such as WebGL, need multiple copies of the same texture
[Chart: texture formats plotted by Pervasive Deployment against Quality, over the periods 2008-2010, 2012-2013 and 2014->]
• DXTC/S3TC (Windows) and PVRTC (iOS)
- NOT royalty-free
- Platform fragmentation
• ETC1: mandated in Android Froyo (400M devices)
- Royalty-free BUT only optional in ES
- Only 4bpp | 3 channel
- No alpha support
• ETC2 / EAC: MANDATED in OpenGL ES 3.0 and OpenGL 4.3
- Royalty-free
- Backward compatible with ETC1
- ETC2: 4bpp | 3 channel
- EAC: 4 (8) bpp | 1 (2) channel
- COMBINED: RGBA 8bpp | 4 channel
- Does not have 1-2 bit compression WITH ALPHA
• ASTC: OpenGL ES 3.0 and OpenGL 4.3 extensions -> Core once proven
- Royalty-free
- Best quality
- Independent control of bit-rate and # channels
- 1 to 4 channel
- 1-8bpp in fine steps
ASTC – Universal Texture Standard
• Adaptive Scalable Texture Compression (ASTC)
- Quality significantly exceeds S3TC or PVRTC at same bit rate
• Industry-leading orthogonal compression rate and format flexibility
- 1 to 4 color components: R / RG / RGB / RGBA
- Choice of bit rate: from 8bpp to <1bpp in fine steps
• ASTC is royalty-free and so is available to be universally adopted
- Shipping as OpenGL/OpenGL ES extension today for industry feedback
[Image comparison: original at 24bpp; ASTC compression at 8bpp, 3.56bpp and 2bpp]
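The bit rates in the comparison follow from how ASTC stores data: every block occupies 128 bits, and the chosen block footprint decides how many texels share those bits. The sketch below works through that arithmetic for 2D footprints. The function names and the 1024x1024 example texture are illustrative assumptions; the 8, 3.56 and 2 bpp figures correspond to 4x4, 6x6 and 8x8 footprints.

```python
ASTC_BLOCK_BITS = 128  # every ASTC block is 128 bits, whatever its footprint


def astc_bpp(block_w, block_h):
    """Bits per texel for a 2D ASTC block footprint."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


def astc_size_bytes(width, height, block_w, block_h):
    """Compressed size of one mip level; partial edge blocks count as whole."""
    blocks_x = -(-width // block_w)   # ceiling division
    blocks_y = -(-height // block_h)
    return blocks_x * blocks_y * ASTC_BLOCK_BITS // 8


def uncompressed_size_bytes(width, height, bits_per_texel=24):
    """Size of the same level stored raw, 24bpp RGB by default."""
    return width * height * bits_per_texel // 8


for footprint in [(4, 4), (6, 6), (8, 8), (12, 12)]:
    bpp = astc_bpp(*footprint)
    size = astc_size_bytes(1024, 1024, *footprint)
    print(f"{footprint[0]}x{footprint[1]}: {bpp:.2f} bpp, {size} bytes for 1024x1024")

print("raw 24bpp:", uncompressed_size_bytes(1024, 1024), "bytes")
```

The 12x12 footprint gives about 0.89 bpp, which is where the "<1bpp" end of the range comes from.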
Conclusion
• AR is a complex application domain and multiple standards across multiple domains are needed to enable the market
• Advances in SoC silicon processing and associated APIs are about to enable Augmented Reality to truly meet user expectations
• Now is a good time to get involved with the standards initiatives that affect your business
• www.khronos.org
• ntrevett@nvidia.com