NVIDIA VIDEO TECHNOLOGIES

Abhijit Patait, GTC 2020

Source: developer.download.nvidia.com/video/gputechconf/gtc/2020/... · 2020. 5. 19.

AGENDA

Video/Image Processing on NVIDIA GPUs: NVENC, NVDEC, NvOFA, NvJPEG

Software Updates and New Features: updates, benchmarks and end-to-end use-cases

Roadmap: upcoming features

Video/Image Processing on NVIDIA GPUs

NVIDIA VIDEO/IMAGE HARDWARE

[Diagram: a GPU containing NVDEC, NVENC, Optical Flow, JPEG decode and CUDA Cores, with Video Memory, connected over PCIe to the CPU and System Memory]

NVDEC – Video Decoder (H.264, H.265, VP9, MPEG-2…)

NVENC – Video Encoder (H.264, H.265)

Optical Flow (Turing+) – Track pixels

JPEG Decoder (Ampere+ architecture)

SOFTWARE

[Diagram: hardware layer with NVENC, NVDEC, Optical Flow and NvJPEG; software layers with the NVIDIA Driver, then the Video Codec SDK, Optical Flow SDK and CUDA Toolkit]

All binaries in NVIDIA driver

SDKs: APIs, reusable samples, documentation

Linux & Windows

CUDA, DirectX, OpenGL, Vulkan APIs

Binary backward compatibility

WHAT’S NEW IN GA100 GPUs?

NVDECs

Pascal: 1

Turing: 1-2

GA100: 5

[Diagram: bitstreams feed NVDEC 0 … NVDEC N for high-res decode (1080p, 720p); each decoded stream is scaled and then used for low-res inference, e.g. 300 × 200]

Not all features will be in all GPUs. Please check NVIDIA developer zone web site for detailed information

WHAT’S NEW IN GA100 GPUs?

NVDECs

Pascal: 1

Turing: 1-2

GA100: 5

Dedicated 5-core JPEG decoder

[Diagram: bitstreams feed JPEG decoder 1 … JPEG decoder 5 for full-res decode; each decoded image is scaled and then used for low-res inference, e.g. 300 × 200]

Not all features will be in all GPUs. Please check NVIDIA developer zone web site for detailed information

WHAT’S NEW IN GA100 GPUs?

NVDECs

Pascal: 1

Turing: 1-2

GA100: 5

Dedicated 5-core JPEG decoder

Improved optical flow engine, independent of NVENC

[Diagram: NVENC and Optical Flow blocks on Turing and Ampere]

• Per-pixel flow vector

• Improved quality & perf

• 8192 × 8192

• Region of interest

Not all features will be in all GPUs. Please check NVIDIA developer zone web site for detailed information

SOFTWARE UPDATES

Video Codec SDK 10.0

Optical Flow SDK 2.0

NvJPEG decode (part of CUDA 11.0)

Upcoming SDK Releases

Video Codec SDK Update

VIDEO CODEC SDK

2017: SDK 8.0 – 10-bit transcode

Q1 2018: SDK 8.1 – 4K60 HEVC encode

Q3 2018: SDK 8.2 – Decode + inference optimizations

Q1 2019: SDK 9.0 – Turing, HEVC B-frame, Multi-NVDEC

Q3 2019: SDK 9.1 – True CBR, CUDA/NVDEC parallelism

Q2 2020: SDK 10.0 – GA100, NVENC presets 2.0

VIDEO SDK 10.0

GA100 Support (Ampere architecture)

Multi-NVDEC performance improvements

Encoder (NVENC) presets 2.0

June 2020

TURING ARCHITECTURE

Up to 2 NVDECs per GPU for Decode + Inference

[Bar chart: number of 1080p30 streams (axis 0 to 180) for H264, HEVC, HEVC 10 and VP9. Series: GA100 (A100-42GB), TU104 (Tesla T4), GP104 (Tesla P4), GV100 (Quadro GV100), GM206 (Tesla M4). Data labels in extraction order: 83, 168, 144, 129; 34, 71, 68, 49; 17, 22, 23, 22; 16, 21, 22, 23; 12, 15, 14, 16]

GA100 WITH AMPERE ARCHITECTURE

5 NVDECs per GPU for Decode + Inference

[Bar chart: number of 1080p30 streams (axis 0 to 180) for H264, HEVC, HEVC 10 and VP9. Series: GA100 (A100-42GB), TU104 (Tesla T4), GP104 (Tesla P4), GV100 (Quadro GV100), GM206 (Tesla M4)]

VIDEO SDK 10.0

GA100 (Ampere microarchitecture) Support

Multi-NVDEC

Encoder (NVENC) presets 2.0

NVENC CONFIGURATION

Video Codec SDK 9.1 and earlier

Choose from

6 Presets: HQ, DEFAULT, HP, LLHQ, LL_DEFAULT, LLHP

6 Rate control modes: Constant QP, VBR, CBR, CBR_LOWDELAY_HQ, CBR_HQ, VBR_HQ

Advanced features: B-frames, B-as-reference, Look-ahead, AQ, weighted prediction, VBV…

Use-case specific configuration

Streaming: LL* + CBR_LOWDELAY_HQ + 1-frame VBV

Transcoding: HQ + VBR/CBR + B-frames + B-as-reference + AQ + high VBV

Resolution-dependent quality/performance

Extreme ends of quality/performance difficult to achieve

INTRODUCING NVENC PRESETS 2.0

Video Codec SDK 10.0

Choose

Use-case/Tuning Info: High-quality, Low Latency, Ultra Low Latency, Lossless

Preset: P1 (highest performance) to P7 (highest quality)

Rate Control Mode: Constant QP, CBR, VBR

Advanced Features: Built-in; tune only if required
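The three choices listed on this slide can be pictured as one small configuration record. The sketch below is self-contained C++ and deliberately avoids the real NVENC headers: `TuningInfo`, `RateControl`, `EncoderChoice` and `describe` are hypothetical names invented for illustration, not SDK identifiers.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the Presets 2.0 choices; these are not SDK types.
enum class TuningInfo { HighQuality, LowLatency, UltraLowLatency, Lossless };
enum class RateControl { ConstQP, CBR, VBR };

struct EncoderChoice {
    TuningInfo tuning;   // use-case / tuning info
    int preset;          // 1 (P1, highest performance) .. 7 (P7, highest quality)
    RateControl rc;      // rate control mode
};

// Validate the preset range and return a label such as "P4/HighQuality/VBR".
inline std::string describe(const EncoderChoice& c) {
    if (c.preset < 1 || c.preset > 7)
        throw std::invalid_argument("preset must be P1..P7");
    static const char* tuningNames[] = {"HighQuality", "LowLatency",
                                        "UltraLowLatency", "Lossless"};
    static const char* rcNames[] = {"ConstQP", "CBR", "VBR"};
    return "P" + std::to_string(c.preset) + "/" +
           tuningNames[static_cast<int>(c.tuning)] + "/" +
           rcNames[static_cast<int>(c.rc)];
}
```

For example, `describe({TuningInfo::HighQuality, 4, RateControl::VBR})` yields `"P4/HighQuality/VBR"`; advanced features are left out because the slide describes them as built in and tuned only if required.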

LEGACY VS NEW NVENC PRESETS

Comparison

Feature | Legacy Presets | Presets 2.0
One-shot configuration | No | Yes
Advanced features config | Needed | Rarely needed
Use-case based | No | Yes
Performance scales with resolution | Sometimes | Always
Multi-pass rate control | Indirect | Direct
Fine granularity of perf vs quality | Low | High
Rate control modes | Complex | Easy to understand
Extreme ends of quality/perf trade-off | No | Yes

NVENC PRESETS: NEW VS LEGACY

HEVC, High quality (latency tolerant), VBR

[Chart: Performance (fps) at 1080p and Quality (bitrate savings) for the new NVENC presets P1 to P7 and the legacy NVENC presets HP, Default and HQ. Values below are taken from the chart's data labels.]

Preset | Performance (fps) at 1080p | Bitrate savings
P1 | 649 | 0.00%
P2 | 424 | 8.80%
P3 | 380 | 14.29%
P4 | 297 | 23.57%
P5 | 230 | 25.86%
P6 | 132 | 27.45%
P7 | 119 | 27.71%
HP (legacy) | 651 | 0.00%
Default (legacy) | 230 | 25.86%
HQ (legacy) | 132 | 27.45%

NVENC PRESETS: NEW VS LEGACY

H.264, High quality (latency tolerant), VBR

[Chart: Performance (fps) at 1080p and bitrate savings for the new NVENC presets P1 to P7 and the legacy NVENC presets HP, Default and HQ. Values below are taken from the chart's data labels.]

Preset | Performance (fps) at 1080p | Bitrate savings
P1 | 634 | 0.00%
P2 | 606 | 0.16%
P3 | 390 | 9.34%
P4 | 304 | 20.60%
P5 | 167 | 21.85%
P6 | 162 | 23.78%
P7 | 140 | 24.08%
HP (legacy) | 611 | 0.16%
Default (legacy) | 406 | 16.36%
HQ (legacy) | 304 | 20.60%

NVENC PRESETS: NEW VS LEGACY

HEVC, Ultra Low Latency (latency sensitive), CBR

[Chart: Performance (fps) at 1080p and bitrate savings for the new NVENC presets P1 to P7 and the legacy NVENC presets LL-HP, LL-Default and LL-HQ. Values below are taken from the chart's data labels.]

Preset | Performance (fps) at 1080p | Bitrate savings
P1 | 327 | 0.00%
P2 | 248 | 1.93%
P3 | 219 | 4.22%
P4 | 188 | 5.66%
P5 | 165 | 5.96%
P6 | 165 | 5.96%
P7 | 164 | 6.07%
LL-HP (legacy) | 340 | -1.46%
LL-Default (legacy) | 289 | 1.00%
LL-HQ (legacy) | 188 | 5.66%

MIGRATION TO NEW NVENC PRESETS

Video Codec SDK 10.0

Strongly recommended to move to new presets

Mapping table in Video Codec SDK 10.0

Backward compatibility – Binary ; Source

Old presets will be removed from API in future SDK

SDK release – June 2020

Encode (NVENC) Benchmarks

ENCODE BENCHMARKS

https://developer.nvidia.com/nvidia-video-codec-sdk

ENCODE BENCHMARK – H.264

Latency Tolerant H.264 Encoding vs x264

[Chart: #1080p30 streams and bitrate saving (higher is better) for Turing NVENC, Pascal NVENC and CPU (x264), with one configuration marked as reference quality. Values below are taken from the chart's data labels.]

Configuration | #1080p30 streams | Bitrate saving
T4 fast (Turing NVENC) | 19 | -7.80%
T4 medium (Turing NVENC) | 10 | 1.62%
P4 medium (Pascal NVENC) | 19 | -11.15%
P4 slow (Pascal NVENC) | 18 | -9.87%
x264 fast (CPU) | 6 | -4.48%
x264 medium (CPU) | 5 | 0%
x264 slow (CPU) | 2 | 8.59%

ENCODE BENCHMARK – HEVC

Latency tolerant HEVC Encoding vs x265

[Chart: #1080p30 streams and bitrate saving (higher is better) for Turing NVENC, Pascal NVENC and CPU (x265), with one configuration marked as reference quality. Values below are taken from the chart's data labels.]

Configuration | #1080p30 streams | Bitrate saving
T4 fast (Turing NVENC) | 13 | -21%
T4 medium (Turing NVENC) | 4 | -10%
P4 medium (Pascal NVENC) | 12 | -35%
P4 slow (Pascal NVENC) | 10 | -35%
x265 fast (CPU) | 2 | -8%
x265 medium (CPU) | 1 | 0%
x265 slow (CPU) | 0.85 | 9%

Optical Flow SDK Update

OPTICAL FLOW SDK

Q1 2019: SDK 1.0 – Turing, 4x4 vectors, 4K

Q3 2019: SDK 1.1 – OpenCV, Accuracy

Q2 2020: SDK 2.0 – GA100, 1x1, 2x2, 4x4, 8K, Region of Interest, Object tracker

OPTICAL FLOW SDK 2.0

What’s New?

Ampere architecture GPUs

Improved hardware, independent of NVENC

[Diagram: NVENC and Optical Flow blocks on Turing and Ampere]

OPTICAL FLOW SDK 2.0

What’s New?

Ampere architecture GPUs

Improved hardware, independent of NVENC

Better performance and higher accuracy than Turing

Up to 300 fps at 4K*

[Chart: Quality* (% accurate ± 3 pixels, axis 72% to 83%) against Performance (fps for 1080p, axis 0 to 1400) for Ampere and Turing at the Slow, Medium and Fast presets]

*Performance is dependent on clocks, available memory bandwidth and well-designed application pipelining

OPTICAL FLOW SDK 2.0

What’s New?

Ampere architecture GPUs

Improved hardware, independent of NVENC

Better performance and higher accuracy than Turing

Up to 300 fps at 4K*

Granularity: 1×1, 2×2, 4×4 pixels

[Figure: flow-vector grids at 4 × 4, 2 × 2 and 1 × 1 pixel granularity]

*Performance is dependent on clocks, available memory bandwidth and well-designed application pipelining
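Granularity determines how many flow vectors a frame produces: one vector per grid block. The following is a minimal sketch of that arithmetic, assuming the output field is rounded up so that partially covered blocks at the right and bottom edges still receive a vector; `FlowFieldSize` and `flowFieldSize` are illustrative names, not SDK symbols.

```cpp
#include <cstdint>

// Dimensions of the flow-vector field for one frame.
struct FlowFieldSize { uint32_t width; uint32_t height; };

// gridSize is 1, 2 or 4: one vector per gridSize x gridSize block of pixels.
inline FlowFieldSize flowFieldSize(uint32_t frameWidth, uint32_t frameHeight,
                                   uint32_t gridSize) {
    // Integer ceiling division (assumed rounding rule for edge blocks).
    return { (frameWidth + gridSize - 1) / gridSize,
             (frameHeight + gridSize - 1) / gridSize };
}
```

At 1920 × 1080 this gives 480 × 270 vectors on a 4 × 4 grid and 1920 × 1080 vectors on a 1 × 1 grid, so the finest granularity produces 16 times as many vectors as the coarsest.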

OPTICAL FLOW SDK 2.0

What’s New?

Ampere architecture GPUs

Improved hardware, independent of NVENC

Better performance and higher accuracy than Turing

Up to 300 fps at 4K*

Granularity: 1×1, 2×2, 4×4 pixels

Resolution up to 8192 × 8192

[Figure: maximum frame size, Turing OF 4096 × 4096 versus Ampere OF 8192 × 8192]

*Performance is dependent on clocks, available memory bandwidth and well-designed application pipelining

OPTICAL FLOW SDK 2.0

What’s New?

Ampere architecture GPUs

Improved hardware, independent of NVENC

Better performance and higher accuracy than Turing

Up to 300 fps at 4K*

Granularity: 1×1, 2×2, 4×4 pixels

Resolution up to 8192 × 8192

Flow vectors for region of interest (ROI)

*Performance is dependent on clocks, available memory bandwidth and well-designed application pipelining

OPTICAL FLOW SDK 2.0

What’s New?

Ampere architecture GPUs

Improved hardware, independent of NVENC

Better performance and higher accuracy than Turing

Up to 300 fps at 4K*

Granularity: 1×1, 2×2, 4×4 pixels

Resolution up to 8192 × 8192

Flow vectors for region of interest (ROI)

¼ pixel accuracy

Improved confidence (cost)

Optical-Flow based object tracker

*Performance is dependent on clocks, available memory bandwidth and well-designed application pipelining
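Sub-pixel accuracy implies that flow vectors are delivered as fixed-point numbers rather than whole pixels. The sketch below converts such a vector to floating-point pixels. It assumes a signed 16-bit layout with 5 fractional bits (S10.5), which the slides do not state and which should be checked against the SDK headers; the struct and function names are illustrative only.

```cpp
#include <cstdint>

// Illustrative types; the field names mirror a typical x/y flow vector.
struct FlowVectorFixed { int16_t flowx; int16_t flowy; };
struct FlowVectorFloat { float x; float y; };

// Convert fixed point to pixels. fractionalBits = 5 corresponds to the
// assumed S10.5 layout, i.e. one unit equals 1/32 pixel.
inline FlowVectorFloat toPixels(FlowVectorFixed v, int fractionalBits = 5) {
    const float scale = 1.0f / static_cast<float>(1 << fractionalBits);
    return { v.flowx * scale, v.flowy * scale };
}
```

Under that assumption a raw value of 8 corresponds to a quarter-pixel displacement, so `toPixels({8, -40})` gives 0.25 pixels in x and -1.25 pixels in y.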

OPTICAL FLOW SDK 2.0

Accuracy vs Performance

[Chart: Accuracy* (% accurate ± 3 pixels, axis 72% to 83%) against Performance (fps for 1080p, axis 0 to 1400) for Ampere and Turing at the Slow, Medium and Fast presets]

*KITTI 2015 FL-ALL NOC

OPTICAL FLOW REGION OF INTEREST

Static background, moving foreground objects; e.g. video surveillance

Identify object of interest

Define ROI with extended bounds ±N pixels

Advantages: Improved performance, less noise in flow vectors
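Defining an ROI with extended bounds amounts to growing the detected bounding box by N pixels on each side and clamping it to the frame. A minimal sketch of that step follows; `Rect` and `expandRoi` are hypothetical helpers and do not come from the Optical Flow SDK.

```cpp
#include <algorithm>

// Hypothetical pixel rectangle: top-left corner plus size.
struct Rect { int x, y, width, height; };

// Grow a bounding box by n pixels on every side, clamped to the frame.
inline Rect expandRoi(const Rect& box, int n, int frameWidth, int frameHeight) {
    int left   = std::max(box.x - n, 0);
    int top    = std::max(box.y - n, 0);
    int right  = std::min(box.x + box.width + n, frameWidth);
    int bottom = std::min(box.y + box.height + n, frameHeight);
    return { left, top, std::max(right - left, 0), std::max(bottom - top, 0) };
}
```

With a 1920 × 1080 frame, a box at (100, 50) of size 40 × 30 and N = 16 becomes a 72 × 62 ROI at (84, 34). The margin gives the flow engine room to follow the object between frames, while the rest of the static background is skipped.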

Main Functionality

nvOpticalFlowCommon.h

CUDA and DirectX Buffer Management

nvOpticalFlowCuda.h

nvOpticalFlowD3D11.h

Reusable Classes

NvOF.h

NvOFCuda.h

NvOFD3D11.h

NV_OF_STATUS(NVOFAPI* PFNNVOFINIT)(NvOFHandle hOf, const NV_OF_INIT_PARAMS* initParams);

NV_OF_STATUS(NVOFAPI* PFNNVOFEXECUTE)(NvOFHandle hOf, const NV_OF_EXECUTE_INPUT_PARAMS* executeInParams, NV_OF_EXECUTE_OUTPUT_PARAMS* executeOutParams);

NV_OF_STATUS(NVOFAPI* PFNNVOFDESTROY)(NvOFHandle hOf);

USE VIA OPENCV

// Load the two input frames as grayscale images
Mat frameL = imread(pathL, IMREAD_GRAYSCALE);
Mat frameR = imread(pathR, IMREAD_GRAYSCALE);
// Upload both frames to GPU memory
GpuMat d_flowL(frameL), d_flowR(frameR), d_flow;
Mat flowx, flowy, flowxy;
int gpuId = 0;
int width = frameL.size().width, height = frameL.size().height;
// Compute dense flow with the CUDA Farneback implementation
Ptr<cuda::FarnebackOpticalFlow> OpticalFlow = cuda::FarnebackOpticalFlow::create();
OpticalFlow->calc(d_flowL, d_flowR, d_flow);
// Copy the flow field back to host memory
d_flow.download(flowxy);

USE VIA OPENCV

// Load the two input frames as grayscale images
Mat frameL = imread(pathL, IMREAD_GRAYSCALE);
Mat frameR = imread(pathR, IMREAD_GRAYSCALE);
GpuMat d_flowL(frameL), d_flowR(frameR), d_flow;
Mat flowx, flowy, flowxy;
int gpuId = 0;
int width = frameL.size().width, height = frameL.size().height;
// Same pipeline with the NVIDIA hardware optical flow engine swapped in
Ptr<cuda::NvidiaOpticalFlow> OpticalFlow = cuda::NvidiaOpticalFlow::create(perfPreset, width, height, gpuId);
OpticalFlow->calc(frameL, frameR, d_flow);
// Copy the flow field back to host memory
d_flow.download(flowxy);

OPTICAL FLOW APPLICATIONS

Object tracking

Video frame interpolation

Video frame extrapolation

Action recognition

OPTICAL FLOW APPLICATIONS

Object tracking

Video frame interpolation

Video frame extrapolation

Action recognition

- API and source code in Optical Flow SDK 2.0

OBJECT TRACKING

Using Feature Tracker or Correlation Filter

[Diagram: Camera 1 … Camera M produce Stream 1 … Stream M, which are decoded by NVDEC 1 … NVDEC N. Detector 1 … Detector M run every Kth frame; Object Tracker 1 … Object Tracker M, backed by a Feature Tracker, run every frame. Output: object classes, IDs and positions]

OBJECT TRACKING WITH ZERO GPU USAGE

Optical Flow is “Free”!!

[Diagram: Camera 1 … Camera M produce Stream 1 … Stream M, which are decoded by NVDEC 1 … NVDEC N. Detector 1 … Detector M run every Kth frame; Object Tracker 1 … Object Tracker M, backed by the NVIDIA Optical Flow Hardware, run every frame. Output: object classes, IDs and positions]

OBJECT TRACKER ALGORITHM

NvOFTracker in Optical Flow SDK 2.0

[Diagram: Decode (NVDEC) ➔ Frames ➔ Detect (yolov3) ➔ Regions of Interest ➔ Tracker (CPU/CUDA), with NvOF Hardware supplying Flow Vectors. Outputs: Object IDs, Object Positions, Object Classes]

Decode video bitstream to frames, P-2, P-1, P (current)

Detect objects of interest (person, bicycles, cars, …)

Define regions of interest to track

Runs every Kth frame

Optical Flow (P, P-1)

Dense flow field ➔ Representative flow

Warp object position

Tracker minimizes cost

Cost = f(Centroid distance, IOU, OF cost, …)
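The steps above (reduce the dense flow field to a representative flow, warp the object position, score candidate matches) can be sketched as follows. This is not NvOFTracker's implementation: the slides do not give the reduction, the cost function or its weights, so the median reduction, the weighted-sum cost and every name and weight below are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Box  { float x, y, w, h; };   // top-left corner plus size, in pixels
struct Flow { float dx, dy; };       // one flow vector, in pixels

// Representative flow of a region: component-wise median of its vectors
// (upper median for even counts). Assumed reduction, not the SDK's.
inline Flow representativeFlow(std::vector<Flow> vectors) {
    if (vectors.empty()) return {0.0f, 0.0f};
    auto mid = vectors.begin() + vectors.size() / 2;
    std::nth_element(vectors.begin(), mid, vectors.end(),
                     [](const Flow& a, const Flow& b) { return a.dx < b.dx; });
    float dx = mid->dx;
    std::nth_element(vectors.begin(), mid, vectors.end(),
                     [](const Flow& a, const Flow& b) { return a.dy < b.dy; });
    return {dx, mid->dy};
}

// Warp the previous object position by the representative flow.
inline Box warp(const Box& b, const Flow& f) {
    return {b.x + f.dx, b.y + f.dy, b.w, b.h};
}

// Intersection over union of two boxes, in [0, 1].
inline float iou(const Box& a, const Box& b) {
    float iw = std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x);
    float ih = std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y);
    if (iw <= 0.0f || ih <= 0.0f) return 0.0f;
    float inter = iw * ih;
    return inter / (a.w * a.h + b.w * b.h - inter);
}

// Hypothetical matching cost (lower is better); the weights are illustrative.
inline float matchCost(const Box& predicted, const Box& detected, float ofCost) {
    float cx = (predicted.x + predicted.w / 2) - (detected.x + detected.w / 2);
    float cy = (predicted.y + predicted.h / 2) - (detected.y + detected.h / 2);
    float centroidDistance = std::sqrt(cx * cx + cy * cy);
    return centroidDistance + 100.0f * (1.0f - iou(predicted, detected)) + ofCost;
}
```

A tracker built this way would warp each tracked box with its representative flow, then assign detections to tracks by minimizing `matchCost` over all pairs. The median is chosen here because it is robust to a few background vectors that fall inside the box.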

NV OBJECT TRACKER API

Source code in Optical Flow SDK 2.0

NvOFTracker.h

C-API

• NvOFTCreate

• NvOFTProcess

• NvOFTDestroy

COFTracker.h

C++ Reusable Class

• COFTracker::COFTracker

• COFTracker::InitTracker

• COFTracker::TrackObjects

Sample Code

int main(int argc, char** argv) {
    // Instantiate a video decoder
    cv::VideoCapture videoIn(clFields.inputFile, cv::CAP_FFMPEG);

    // Initialize and instantiate object detector
    CDetector objectDetector(detectorEngineFile, gpuId);

    // Instantiate NvOFT Object tracker
    COFTracker objectTracker(w, dh, NvOFT_SURFACE_MEM_TYPE_SYSTEM,
                             NvOFT_SURFACE_FORMAT_Y, gpuId);

    while (frame_available) {
        // Read the next video frame
        cv::Mat frame;
        videoIn >> frame;

        if (nFrame % N == 0) {
            // Run detector every Nth frame and get bounding boxes
            boxes = objectDetector.Run(frame.data, frameProperties);
        } else {
            // Track detected objects in every frame
            trackedObjects = objectTracker.TrackObjects(frame.data,
                frameSize, frame.step[0], inputObjectsVector, FALSE);
        }

        DrawTrackedObjects(…)
    }

    videoIn.release();
}

OBJECT TRACKER IN OPTICAL FLOW SDK

Contents

Source code for end-to-end pipeline with

Video decoder

Object detector (non-NVIDIA, yoloV3)

Object tracker based on NVIDIA optical flow

API for easy integration

Sample application and documentation

Integrable with NVIDIA DeepStream SDK

OBJECT TRACKER PROFILING

Roadmap

Video Codec SDK 11.0 (Q3 2020)

Deprecation of presets

Samples moved to GitHub, online documentation

DirectX 12 and Vulkan support via sample apps

Optical Flow SDK 3.0 (Q3 2020)

Ampere architecture GPUs

Object detector & tracker

Frame rate interpolation/extrapolation

Optical vector pre/post-processing APIs

RESOURCES

Video Codec SDK: https://developer.nvidia.com/nvidia-video-codec-sdk

Optical Flow SDK: https://developer.nvidia.com/opticalflow-sdk

Video Technologies Developer Forum: https://devtalk.nvidia.com/default/board/175/video-technologies/

Support: [email protected]

CWE21120: How to Use Video Codec and Optical Flow SDK on NVIDIA GPUs Effectively
