Post on 26-Aug-2020
Optimizing Film, Media with OpenCL & Intel Quick Sync Video
Petter Larsson, Senior Software Engineer
Ryan Tabrah, Product Manager
SIGGRAPH 2012
The Intel® Vision: Enriching the lives of every person on earth through technology
Visual Tools for Developers
Both Available FREE of charge…
High performance GPU acceleration for complete media pipelines
Seamless surface sharing between Media and OpenCL context
Effortless fallback on CPU processing for legacy platforms
Complete GPU workload analysis via Intel Graphics Performance Analyzer
Intel SDK for OpenCL Applications 2012 is a comprehensive development environment for OpenCL applications
An open standard compute model: Enables applications with cross-architecture functional portability on 3rd Generation Intel Core™ processor-based platforms
Intel Media SDK 2012 is a great way to optimize applications to utilize the power of Intel Quick Sync Video
Hardware accelerated video encoding, decoding, and transcoding: Fully utilize the power of Intel Core HD Graphics
Extend programmability options on Intel platforms: Augments Intel’s developer choice of programming tools on Intel platforms
Visual Tools for Developers
Here’s where you would expect a roadmap… You don’t need one.
• API future-proofs your software: code NOW and optimize for tomorrow’s platforms
• Deliver your products on your own timeline
• Single interface for all compute devices
• Span across OS and platform versions
Deliver the best, most efficient user experience to your customers, utilizing the full power of Intel Core CPU and HD Graphics technology.
Intel Quick Sync Video & OpenCL*: The Speed You Need
Live Demo with Sony Movie Studio 12
Media Conversion With Intel Quick Sync Video & OpenCL* Hardware Acceleration
Live Demo
Graphics and Media Interoperability with OpenCL*
APIs                                            Extensions
DirectX <-> OpenCL                              cl_khr_d3d10_sharing
OpenGL <-> OpenCL                               cl_khr_gl_sharing, cl_khr_gl_event
DirectX Video Acceleration (DXVA) <-> OpenCL    cl_intel_dx9_media_sharing
Intel Media SDK <-> OpenCL                      cl_intel_dx9_media_sharing
Support columns: Intel HD Graphics, CPU Device
Interoperability with Intel Media SDK, DirectX* and OpenGL* APIs allows OpenCL developers to better utilize platform resources on graphics tasks
SDK Interoperability Sample/Demo - using cl_intel_dx9_media_sharing extension
Video stream
•H.264(AVC)
•MPEG2
•VC1
•MJPEG
Media SDK video decode
•Decode to D3D NV12 surfaces
•DirectXVideoDecoderService surfaces
OpenCL frame processing
•Color effect (NV12)
•Water ripples (NV12)
•Twirl (RGB)
•Flip (RGB)
Render
•Rendered to window or full screen via VideoProcessBlt
Stages exchange frames through shared surfaces; the pipeline is GPU accelerated.
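The NV12 surfaces named in the pipeline above store a full-resolution luma (Y) plane followed by one half-height plane of interleaved U/V bytes. The following is a minimal CPU-side sketch of that layout and of a simple color effect (desaturation by writing neutral chroma). It is not the demo's OpenCL kernel, which is not reproduced here; the function names are hypothetical, and the sketch assumes a tightly packed buffer with no row pitch padding, which real D3D surfaces generally do not guarantee.

```python
def nv12_plane_sizes(width, height):
    """Return (luma_bytes, chroma_bytes) for a tightly packed NV12 frame.

    Assumes even dimensions and no row padding (pitch == width).
    """
    if width <= 0 or height <= 0 or width % 2 or height % 2:
        raise ValueError("NV12 needs positive, even width and height")
    luma = width * height            # one Y byte per pixel
    chroma = width * (height // 2)   # interleaved U,V pairs, one pair per 2x2 block
    return luma, chroma


def desaturate_nv12(frame, width, height):
    """Return a grayscale copy of an NV12 frame (hypothetical example effect).

    The Y plane is kept as-is; every U and V byte is set to 128 (neutral chroma).
    """
    luma, chroma = nv12_plane_sizes(width, height)
    if len(frame) != luma + chroma:
        raise ValueError("buffer size does not match a packed NV12 frame")
    return bytes(frame[:luma]) + bytes([128]) * chroma
```

For the 1440x1088 benchmark clip this layout gives 1440 x 1088 luma bytes plus half as many chroma bytes per frame, which is the amount of data each effect pass touches when it works on NV12 directly.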
Setup buffers:
clCreateFromDX9MediaSurfaceINTEL
Processing:
1. clSetKernelArg
2. clEnqueueAcquireDX9ObjectsINTEL
3. clEnqueueNDRangeKernel/clEnqueueTask
4. clEnqueueReleaseDX9ObjectsINTEL
5. clFlush
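The per-frame order of the five processing calls can be sketched without any OpenCL binding. The snippet below is only an illustration of call ordering: `CallLog`, `acquired` and `process_frame` are hypothetical stand-ins that record the names of the calls listed above and do not invoke OpenCL. It also shows one possible design choice, pairing acquire and release so the surface is released even if enqueueing the kernel fails.

```python
from contextlib import contextmanager


class CallLog:
    """Hypothetical stand-in: records call names only, not a real OpenCL binding."""

    def __init__(self):
        self.calls = []

    def call(self, name):
        self.calls.append(name)


@contextmanager
def acquired(log):
    """Bracket work on a shared surface with acquire/release."""
    log.call("clEnqueueAcquireDX9ObjectsINTEL")
    try:
        yield
    finally:
        # Release runs on both the normal and the error path.
        log.call("clEnqueueReleaseDX9ObjectsINTEL")


def process_frame(log, kernel_fails=False):
    """Replay the five per-frame steps in the listed order."""
    log.call("clSetKernelArg")
    with acquired(log):
        if kernel_fails:
            raise RuntimeError("simulated enqueue failure")
        log.call("clEnqueueNDRangeKernel")
    log.call("clFlush")
```

Calling `process_frame(CallLog())` records the five names in the same order as the numbered steps; with `kernel_fails=True` the log still ends with the release call.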
SDK Interoperability Sample/Demo - integration specifics
• Code based on Intel Media SDK “sample_decode”
– Includes common file access, memory and device mgmt functions
• OpenCL processing class
– Handles OpenCL device setup/teardown and frame processing
– OpenCL processing applied on D3D surface before rendering
– Integrated with the “sample_decode” renderer class
• Features and effects selected via keyboard input
Next… Demo… Quick code walkthrough…
SDK Interoperability Sample Code walkthrough
Code “fly-by” will showcase the following
• How to set up the OpenCL environment with surface sharing specifics
– DX9_MEDIA_SHARING / “cl_ext.h”
– Check for “cl_intel_dx9_media_sharing” extension availability
– Hook up OCL D3D sharing extensions
• How to set up Media SDK sessions and the basic decode process
• DXVA surface allocation: how to create shared handles
• Media SDK/OpenCL: key integration points
• OpenCL kernel code for the demo effects
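The extension availability check from the walkthrough can be illustrated on the string that OpenCL reports for a device (the `CL_DEVICE_EXTENSIONS` query returns a space-separated list of names). This is a minimal sketch, not the sample's code: `has_extension` is a hypothetical helper and the example string is made up for illustration. Matching whole tokens rather than substrings avoids false positives on names that share a prefix.

```python
def has_extension(extension_string, name):
    """Return True if `name` is one of the space-separated extension names."""
    return name in extension_string.split()


# Made-up example of a device extension string, for illustration only.
EXAMPLE_EXTENSIONS = (
    "cl_khr_d3d10_sharing cl_khr_gl_sharing cl_khr_gl_event "
    "cl_intel_dx9_media_sharing"
)

if __name__ == "__main__":
    wanted = "cl_intel_dx9_media_sharing"
    status = "available" if has_extension(EXAMPLE_EXTENSIONS, wanted) else "missing"
    print(wanted, status)
```

An application would typically run this check before hooking up the sharing entry points and fall back to a non-shared path when the extension is missing.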
SDK Interoperability Sample/Demo - performance
• Demo benchmark: 1440x1088 AVC video clip
• Analysis of GPU performance using Intel Graphics Performance Analyzer (GPA)
Workload                                 fps   CPU (%)   GPU EU (%)   GPU Decode (%)
HW decode + OCL color effect             225   28        95           20
HW decode (30 fps) + OCL color effect    30    4         45           6
SW decode + OCL color effect             110   95        65           0
SW decode (30 fps) + OCL color effect    30    25        40           0
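A few derived figures make the measurements above easier to compare. The sketch below copies the four rows into a small table and computes the unthrottled throughput ratio, the per-frame time, and the CPU load difference at 30 fps. The type and helper names are hypothetical; only the numbers come from the slide.

```python
from typing import NamedTuple


class Run(NamedTuple):
    workload: str
    fps: int
    cpu_pct: int
    gpu_eu_pct: int
    gpu_decode_pct: int


# Rows copied from the benchmark slide (1440x1088 AVC clip).
RUNS = [
    Run("HW decode + OCL color effect", 225, 28, 95, 20),
    Run("HW decode (30 fps) + OCL color effect", 30, 4, 45, 6),
    Run("SW decode + OCL color effect", 110, 95, 65, 0),
    Run("SW decode (30 fps) + OCL color effect", 30, 25, 40, 0),
]


def by_name(name):
    """Look up a run by its exact workload label."""
    return next(run for run in RUNS if run.workload == name)


def throughput_ratio(a, b):
    """How many times faster run `a` is than run `b`, by fps."""
    return a.fps / b.fps


def frame_time_ms(run):
    """Average time per frame in milliseconds."""
    return 1000.0 / run.fps


if __name__ == "__main__":
    hw = by_name("HW decode + OCL color effect")
    sw = by_name("SW decode + OCL color effect")
    hw30 = by_name("HW decode (30 fps) + OCL color effect")
    sw30 = by_name("SW decode (30 fps) + OCL color effect")
    print(f"HW vs SW throughput: {throughput_ratio(hw, sw):.2f}x")
    print(f"HW frame time: {frame_time_ms(hw):.2f} ms")
    print(f"CPU load saved at 30 fps: {sw30.cpu_pct - hw30.cpu_pct} points")
```

Read this way, hardware decode roughly doubles unthrottled throughput while using far less CPU, and at a fixed 30 fps it cuts CPU load from 25% to 4%.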
What’s in the future?
• DirectX 11
– Unifies API for both video and 3D content
– ID3D11VideoDevice: Decode directly tied to DX11 / DXGI
– Direct Flip: saves a memory copy during playback of a video frame
• OpenCL 1.2
– DX11 buffer sharing extension (cl_khr_d3d11_sharing)
– cl_intel_dx9_media_sharing promoted to cl_khr_dx9_media_sharing
• GPA improvements
– DX11 support
THANK YOU!
• Download these tools now for free:
http://intel.com/software/vcsource
• Follow us on Twitter:
@IntelMediaSDK
@IntelOpenCL
@IntelVCDev
ryan.tabrah@intel.com
Accelerate Visual Development Faster
SDK Interoperability Resources
Media SDK
• OpenCL plug-in sample: Simple rotate kernel (not using shared surfaces)
OpenCL SDK
• ResourceSharing sample: D3D10 buffer & DXVA surface sharing
• MediaSDKInterop sample: Media SDK plug-in; OpenCL post processing effects
Session demo
• Media SDK decode – OpenCL post processing
Additional Resources
• Collecting OpenCL*-related Metrics with Intel® Graphics Performance Analyzers
• Using Intel® Graphics Performance Analyzer (GPA) to analyze Intel® Media Software Development Kit-enabled applications
• Performance Interactions of OpenCL* Code and Intel® Quick Sync Video on Intel® HD Graphics 4000
• Forums
– Intel Media SDK
– Intel OpenCL SDK
Legal Disclaimer and Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Core, VTune, and Cilk are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804