High Performance GPU Video Encoding | GTC...
Transcript of High Performance GPU Video Encoding | GTC...
![Page 2: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/2.jpg)
Agenda
� Why GPU Video Encoding
� NVIDIA H.264 Video Encoding Solutions
� Hardware Architecture
� Software Architecture
� Performance
![Page 3: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/3.jpg)
Why GPU Video Encoding?
![Page 4: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/4.jpg)
Benefits of Encoding on GPU
� Low power
— Fixed function hardware
— Reduced memory transfers
� Low latency
� High performance
� Higher density
— 2x channel density @ ~50% power consumption
� Scalability
![Page 5: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/5.jpg)
Cloud Streaming – Encode on CPU
AppTextures & vertices in sys-mem
Textures & vertices in vid-mem
Render & Present
Captured image in vid-mem
Transfer image to sys-mem
EncodePacketize
& transmit
CPU GPU CPU
Power, LatencyLarge memory transfers
PowerCPU-intensive tasks
Cost/seat , channel density CPU-intensive tasks
![Page 6: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/6.jpg)
Cloud Streaming – Encode on CPU
Power, LatencyLarge memory transfers
PowerCPU-intensive tasks
Limited scalabilityFixed number of CPUs
AppTextures & vertices in sys-mem
Textures & vertices in vid-mem
Render & Present
Captured image in vid-mem
Transfer image to sys-mem
EncodePacketize
& transmit
Transfer image to sys-mem
EncodeApp
Textures & vertices in sys-mem
Textures & vertices in vid-mem
Render & Present
Captured image in vid-mem
Transfer image to sys-mem
EncodePacketize
& transmit
Transfer image to sys-mem
EncodeApp
Textures & vertices in sys-mem
Textures & vertices in vid-mem
Render & Present
Captured image in vid-mem
Transfer image to sys-mem
EncodePacketize
& transmit
Transfer image to sys-mem
EncodeApp
Textures & vertices in sys-mem
Textures & vertices in vid-mem
Render & Present
Captured image in vid-mem
Transfer image to sys-mem
EncodePacketize
& transmit
Transfer image to sys-mem
Encode
? ? ? ? ? ? ? ?
Cost/seat , channel density CPU-intensive tasks
![Page 7: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/7.jpg)
AppTextures & vertices in sys-mem
Textures & vertices in vid-mem
Render & Present
Captured image in vid-mem
Packetize &
transmit
CPU
Low-power, low-latencyNo large memory transfers
Low powerUse CPU only where needed
Cost/seat , channel density Use CPU only where needed
Encode
CPUGPU
Cloud Streaming – Encode on GPU
![Page 8: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/8.jpg)
Cloud Streaming – Encode on GPU
AppTextures & vertices in sys-mem
Textures & vertices in vid-mem
Render & Present
Captured image in vid-mem
Packetize &
transmitEncode
Low-power, low-latencyNo large memory transfers
Low powerUse CPU only where needed
Cost/seat , channel density Use CPU only where needed
EncodeTextures & vertices in vid-mem
Render & Present
Captured image in vid-mem
EncodeEncodeTextures & vertices in vid-mem
Render & Present
Captured image in vid-mem
EncodeEncodeTextures & vertices in vid-mem
Render & Present
Captured image in vid-mem
EncodeEncodeTextures & vertices in vid-mem
Render & Present
Captured image in vid-mem
EncodeEncodeTextures & vertices in vid-mem
Render & Present
Captured image in vid-mem
EncodeEncodeTextures & vertices in vid-mem
Render & Present
Captured image in vid-mem
EncodeEncodeTextures & vertices in vid-mem
Render & Present
Captured image in vid-mem
EncodeEncode
Excellent scalabilityAdd GPUs as needed
![Page 9: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/9.jpg)
NVIDIA H.264 Video Encoding Solutions
![Page 10: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/10.jpg)
NVIDIA H.264 Video Encoding SolutionsCUDA E
ncodin
g • Hybrid processing (CPU + CUDA)
• ME, intra-prediction, mode decision in CUDA
• VLE on CPU
• Performance scales with CUDA cores
• Works on all GPUs (Tesla, Fermi, Kepler, …)
NVENC • Fully hardware
accelerated
• ME, intra-prediction, mode decision, VLE
• High performance, low power
• Kepler+ GPUs
![Page 11: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/11.jpg)
NVIDIA H.264 Video Encoding Solutions
• Distributed with CUDA SDK libraries
• No low-latency streaming
• All Platforms –GeForce, Quadro, Tesla, GRID
• Windows only
• Proprietary software API
• Optimized for low-latency streaming
• Better visual quality
• Quadro, GRID and Tesla
• Windows & Linux
![Page 12: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/12.jpg)
Power vs. Performance
8
1
2
3
4
5
6
7
Performance n
×× ××HD
Power (Watts)
NVENC (HQ)
NVENC (HP)
CUDA (GK107)
CUDA (GK104)
CUDA (GF110)
CUDA (GF104)
![Page 13: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/13.jpg)
NVENC FeaturesFeature What it enables
H.264 base, main, high profiles Wide range of use-cases
Up to 8x HD encode (1080p @ 240 fps) Faster than real-time encoding
Flexible ME, QP maps Customizable quality, region of interest encoding
YUV 4:2:0 and planar 4:4:4 support High quality encoding without chroma subsampling
MVC Full resolution stereo encode
Up to 4096 × 4096 in HW High resolution encode
API NVENC SDK (Flexible API, Win/Linux, x86)GRID SDK (Capture+ Encode, Win - now, Linux -future)
NVENC and CUDA parallelism Simultaneous and parallel HW and CUDA encoding for increased performance
![Page 14: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/14.jpg)
NVENC Hardware Architecture
![Page 15: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/15.jpg)
NVENC Arch: Microcontroller
Microcontroller
DMA Controller
Motion Estimation Mode
Decision
Intrasearch
& recon loop
EntropyCoding
Video memory (FB) interface
Memory
Host • NVIDIA proprietary
microcontroller
• Runs firmware
• Programs encoder
blocks
• Rate control
![Page 16: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/16.jpg)
NVENC Arch: Motion Estimation
Microcontroller
DMA Controller
Motion Estimation Mode
Decision
Intrasearch
& recon loop
EntropyCoding
Video memory (FB) interface
Memory
Host • Exhaustive full-pel
search (L0, L1, Bi)
• Temporal
• Spatial
• Coloc
• Constant
• External
• Half-pel and
quarter-pel
refinement
• Motion
compensation
![Page 17: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/17.jpg)
NVENC Arch: Mode Decision
Microcontroller
DMA Controller
Motion Estimation Mode
Decision
Intrasearch
& recon loop
EntropyCoding
Video memory (FB) interface
Memory
Host • Calculates inter-MB
cost
• Compares to intra-
MB cost and decides
final winner
![Page 18: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/18.jpg)
NVENC Arch: Microcontroller
Microcontroller
DMA Controller
Motion Estimation Mode
Decision
Intrasearch
& recon loop
EntropyCoding
Video memory (FB) interface
Memory
Host • H.264 intra search
• Forward DCT &
quantization
• Recon-loop (IDCT,
IQT, deblocking)
![Page 19: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/19.jpg)
NVENC Arch: Microcontroller
Microcontroller
DMA Controller
Motion Estimation Mode
Decision
Intrasearch
& recon loop
EntropyCoding
Video memory (FB) interface
Memory
Host
• CAVLC & CABAC
![Page 20: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/20.jpg)
NVENC Software Architecture
![Page 21: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/21.jpg)
Using NVENCNVENC SDK • No capture
• Transcoding
• Archiving
• Video editing
• CUDA + encoding
• D3D, CUDA interop
• Exhaustive encoder settings
GRID
SDK • Capture + encode
• Optimized for low-latency apps
• Limited encoder settings
Direct
Encode
Capture +
Encode
![Page 22: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/22.jpg)
Direct NVENC Encode (NVENC SDK)
Client application
NVENC API
NVENC
Driver
DirectX
Driver
CUDA
Driver
NVENC firmware + hardware
Initialize, Configure, Encode
Configure HW
HW Encode
Encoded
bitstream
![Page 23: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/23.jpg)
Capture and Encode (GRID SDK)
Client application
GRID SDK
NVENC
Driver
DirectX
Driver
NVENC Hardware
Capture
YUV
GPU 3D Engine
DX/OGL Present
Encode
Encoded
Bitstream
![Page 24: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/24.jpg)
NVENC SDK
� Available on NVIDIA developer zone
— https://developer.nvidia.com/nvidia-video-codec-sdk
� .DLL/.so, interface header, documentation, sample apps
� Unified API for Windows and Linux
� Works on x86/x64
� Various presets and API’s for
— Transcoding
— Video conferencing
— Remote graphics (Cloud gaming, remote desktop, capture & stream)
� Supports CBR, VBR rate control
![Page 25: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/25.jpg)
NVENC SDK (Contd.)
� Advanced features
— Dynamic resolution change
— Dynamic bitrate change
— Reference picture invalidation
— Temporal SVC
— Intra-refresh
— Two-pass rate control for constant quality
![Page 26: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/26.jpg)
GRID SDK Encode
� Licensed from NVIDIA
� .DLL/.so, interface header, documentation, sample apps
� Windows (now) and Linux (future)
� Works on x86/x64
� Various presets and API’s for
— Remote graphics (Cloud gaming, remote desktop, capture & stream)
� Optimized for low latency and high quality
![Page 27: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/27.jpg)
NVENC Performance
Preset Video 1080p Gaming 1080p Gaming 720p Gaming 720p
HP 223.21 fps 216.45 fps 483.09 fps 485.44 fps
HQ (With 1-B frame) 116.69 fps 122.55 fps 263.85 fps 290.70 fps
HQ (No B-frames) 144.30 fps 130.72 fps 311.53 fps 336.70 fps
0.00 fps
100.00 fps
200.00 fps
300.00 fps
400.00 fps
500.00 fps
600.00 fps
Video 1080p Gaming 1080p Gaming 720p Gaming 720p
HP
HQ (With 1-B frame)
HQ (No B-frames)
![Page 28: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/28.jpg)
Performance (n simultaneous HD encodes)
0 fps
20 fps
40 fps
60 fps
80 fps
100 fps
120 fps
140 fps
160 fps
gijoe_highmotion.yuv (VC
IPP)
gijoe_highmotion.yuv (VC
IBP)
ghost_rider_long.yuv (VC IPP)ghost_rider_long.yuv (VC IBP)
Target perf
N/A
1 contexts
2 contexts
3 contexts
4 contexts
5 contexts
6 contexts
7 contexts
8 contexts
9 contexts
10 contexts
• Encode bit rate = 30 Mbps• Performance shown with more than 1 context is the sum of fps obtained from each context
![Page 29: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/29.jpg)
Gaming Sequence (720p) – 5 Mbps
Target
Actual
0
2
4
6
8
10
1 101 201 301
![Page 30: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/30.jpg)
0
5
10
15
1 101 201 301 401 501 601 701
Gaming Sequence (720p) – 10 Mbps
Target
Actual
![Page 31: High Performance GPU Video Encoding | GTC 2013on-demand.gputechconf.com/...High-Performance-GPU-Video-Encoding.pdf · Agenda Why GPU Video Encoding NVIDIA H.264 Video Encoding Solutions](https://reader031.fdocuments.in/reader031/viewer/2022021715/5c2c934009d3f292178d7796/html5/thumbnails/31.jpg)
Questions?