BING: Binarized Normed Gradients for Objectness Estimation ...

Adaptive Visual Perception Technology and Applications. Ming-Ming Cheng (程明明), http://mmcheng.net. College of Computer Science, Nankai University.

Transcript of BING: Binarized Normed Gradients for Objectness Estimation ...

Page 1: BING: Binarized Normed Gradients for Objectness Estimation ...

Adaptive Visual Perception Technology

Ming-Ming Cheng

College of Computer Science, Nankai University

Page 2: BING: Binarized Normed Gradients for Objectness Estimation ...

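More than 50% of human neurons are devoted to visual information processing

Image Credit: https://badremuneer.in/the-colours-of-light-green-is-best-for-brain-eyes-health/

The human brain contains roughly 100 billion neurons, making it more powerful than the strongest supercomputer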

Page 3: BING: Binarized Normed Gradients for Objectness Estimation ...

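Challenges facing visual perception technology

Objects vary in size, have complex shapes, appear in changing environments, and span many categories. How can limited computing resources be used to understand an infinitely complex real world?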

Page 4: BING: Binarized Normed Gradients for Objectness Estimation ...

Related papers

Res2Net: A New Multi-scale Backbone Architecture

• IEEE TPAMI 2020

Nonlinear Regression via Deep Negative Correlation Learning

• IEEE TPAMI 2020

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

• IEEE CVPR 2020

Page 5: BING: Binarized Normed Gradients for Objectness Estimation ...

The development of computer vision: a multi-scale perspective

AlexNet (NIPS'12)

SIFT (ICCV'99)

62,557 citations

56,729 citations

Page 6: BING: Binarized Normed Gradients for Objectness Estimation ...

The development of deep neural networks: a multi-scale perspective

VggNet (ICLR'15)

37,951 citations

Page 7: BING: Binarized Normed Gradients for Objectness Estimation ...

The development of deep neural networks: a multi-scale perspective

ResNet (CVPR'16 Best Paper)

DenseNet (CVPR'17 Best Paper)

46,432 citations

Page 8: BING: Binarized Normed Gradients for Objectness Estimation ...

CNN: convolution, activation, pooling

Convolution?

Page 9: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

Bottleneck block; Res2Net module

Res2Net: A New Multi-scale Backbone Architecture, IEEE TPAMI, 2020.
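
The slide contrasts a standard bottleneck block with the Res2Net module. As a rough illustration of the hierarchical split described in the cited paper, the toy sketch below divides the channels into `scale` groups, passes the first group through unchanged, and feeds each later group through an operation after adding the previous group's output. The function name `res2net_split_forward` and the `ops` callables (stand-ins for the per-group 3x3 convolutions) are assumptions made for this example, and each channel is reduced to a single float.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def res2net_split_forward(
    x: Sequence[float],
    scale: int,
    ops: Sequence[Callable[[Vector], Vector]],
) -> Vector:
    """Toy forward pass of the hierarchical split in a Res2Net-style module.

    x     : channel values (toy setting: one float per channel)
    scale : number of channel groups s
    ops   : s - 1 hypothetical stand-ins for the per-group 3x3 convolutions
    """
    if len(x) % scale != 0:
        raise ValueError("channel count must be divisible by scale")
    if len(ops) != scale - 1:
        raise ValueError("need one op for every group except the first")

    width = len(x) // scale
    groups = [list(x[i * width:(i + 1) * width]) for i in range(scale)]

    outputs = [groups[0]]  # y_1 = x_1: the first group is passed through
    prev = None
    for i in range(1, scale):
        if prev is None:
            inp = groups[i]  # y_2 = K_2(x_2)
        else:
            # y_i = K_i(x_i + y_{i-1}): reuse the previous group's output
            inp = [a + b for a, b in zip(groups[i], prev)]
        prev = ops[i - 1](inp)
        outputs.append(prev)

    # Concatenate all group outputs back into one channel vector.
    return [v for group in outputs for v in group]


if __name__ == "__main__":
    double = lambda v: [2.0 * t for t in v]
    print(res2net_split_forward([1.0, 1.0, 1.0, 1.0], 4, [double, double, double]))
```

Because every later group sees the output of the group before it, the last group's output has passed through several stacked operations, which is one way to read the "multi-scale" property claimed for the module.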

Page 10: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 1: Image classification (Res2Net-v1b)

Backbone             Params    GFLOPs   top-1 err. (%)   top-5 err. (%)
ResNet-101           44.6M     7.8      22.63            6.44
ResNeXt-101-64x4d    83.5M     15.5     20.40            -
HRNetV2p-W48         77.5M     16.1     20.70            5.50
Res2Net-v1b-50       25.23M    4.5      19.73            4.96
Res2Net-v1b-101      45.2M     8.3      18.77            4.64

Comparison with mainstream models in the open-source object detection library from SenseTime and CUHK

Page 11: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 2: Object detection

• Faster R-CNN, MS-COCO

• https://github.com/Res2Net/mmdetection

Backbone           Params.   GFLOPs   box AP
R-101-FPN          60.52M    283.14   39.4
X-101-64x4d-FPN    99.25M    440.36   41.3
HRNetV2p-W48       83.36M    459.66   41.5
Res2Net-v1b-101    61.18M    293.68   42.3

Page 12: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 3: Class Activation Mapping

Page 13: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 4: Salient object detection (segmentation)

Columns: images, ground truth (GT), ResNet-50, Res2Net-50

Page 14: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 4: Salient object detection, PoolNet (CVPR 2019)

• https://github.com/Res2Net/Res2Net-PoolNet

[Figure: chart comparing VGG, ResNet50, and Res2Net50 backbones on ECSSD, PASCAL-S, HKU-IS, SOD, and DUTS-TE; axis ticks from 0.85 to 0.95]

Page 15: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 5: Semantic segmentation (Deeplab v3+)

Page 16: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 5: Semantic segmentation (PASCAL VOC12 val set)

Page 17: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 6: Instance segmentation

• Mask-RCNN, MS-COCO

Page 18: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 6: Instance segmentation

• Mask-RCNN, MS-COCO

• https://github.com/Res2Net/mmdetection

Backbone           Params.   GFLOPs   box AP   mask AP
R-101-FPN          63.17M    351.65   40.3     36.5
X-101-64x4d-FPN    101.9M    508.87   42.0     37.7
HRNetV2p-W48       86.01M    528.17   42.9     38.3
Res2Net-101        63.83M    362.18   43.3     38.6

Page 19: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 7: Keypoint estimation (COCO 2017)

Page 20: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 7: Keypoint estimation (COCO 2017)

• https://github.com/Res2Net/Res2Net-Pose-Estimation

• Key-point method: SimpleBaseline [Xiao et al., ECCV'18].

Backbone          AP      AP (M)   AP (L)
ResNet_50         0.724   0.697    0.765
Res2Net_50        0.737   0.708    0.782
Res2Net_v1b_50    0.743   0.713    0.792

Page 21: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 8: Interactive segmentation (Lin et al. CVPR'20)

Page 22: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 8: Interactive segmentation (Lin et al. CVPR'20)

Page 23: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 9: Panoptic segmentation (Detectron2)

Page 24: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Application 9: Panoptic segmentation (Detectron2, MS-COCO)

Name              Train mem (GB)   Box AP   Mask AP   PQ
R50-FPN           4.8              40.0     36.5      41.5
R101-FPN          6.0              42.4     38.5      43.0
Res2Net101-FPN    6.0              44.0     39.6      44.5

Detectron2 is Facebook AI Research's next-generation software system that implements state-of-the-art object detection algorithms.

Page 25: BING: Binarized Normed Gradients for Objectness Estimation ...

A general deep neural network architecture with a rich scale space

• Other applications: https://mmcheng.net/res2net/

Vectorized road detection

Person re-identification

Depth estimation

Tumor segmentation in CT images

Page 26: BING: Binarized Normed Gradients for Objectness Estimation ...

Hand-crafted design vs. NAS

• Restricted search space; difficult to adapt to hardware

Tested on GTX 1080Ti

Page 27: BING: Binarized Normed Gradients for Objectness Estimation ...

CNN: convolution, activation, pooling

Pooling?

Page 28: BING: Binarized Normed Gradients for Objectness Estimation ...

Non-local/anisotropy context

Both fine details and global information need to be captured

Page 29: BING: Binarized Normed Gradients for Objectness Estimation ...

Non-local context information

Non-local neural networks, CVPR 2018.

Attention to scale: Scale-aware semantic image segmentation, CVPR 2016.

Non-local modules Self-attention

Computing a large affinity matrix uses huge resources!
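
To make the cost concrete: a non-local or self-attention layer over an H x W feature map compares every position with every other position, so its affinity matrix has (H*W)^2 entries. The back-of-the-envelope helper below is only an illustration; the function name and the 4-bytes-per-value (float32) default are assumptions, and it counts a single matrix for a single image.

```python
def affinity_matrix_bytes(height: int, width: int, bytes_per_value: int = 4) -> int:
    """Memory needed for one dense (H*W) x (H*W) affinity matrix."""
    positions = height * width
    return positions * positions * bytes_per_value


if __name__ == "__main__":
    for side in (32, 64, 128):
        mib = affinity_matrix_bytes(side, side) / 2**20
        print(f"{side}x{side} feature map: {mib:.0f} MiB")
```

Doubling the spatial resolution multiplies the matrix size by 16, which is why such modules are usually applied only to low-resolution feature maps.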

Page 30: BING: Binarized Normed Gradients for Objectness Estimation ...

Non-local context information

Non-local neural networks, CVPR 2018.

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, PAMI 2018.

Dilated convolution; pyramid/global pooling

Incapable of capturing anisotropic context!

Page 31: BING: Binarized Normed Gradients for Objectness Estimation ...

Strip pooling

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing, IEEE CVPR, 2020.

Page 32: BING: Binarized Normed Gradients for Objectness Estimation ...

Strip Pooling (SP) module

Long range connection along one direction, local context along the other direction
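
A minimal sketch of the pooling idea, assuming a single-channel feature map stored as a list of rows: averaging along each full row and each full column yields two 1D strips, and broadcasting them back gives every position context from its entire row and column. The learned 1D convolutions and the gating step of the published module are omitted here, and the function name `strip_pool_context` is made up for this example.

```python
from typing import List

Matrix = List[List[float]]


def strip_pool_context(feature: Matrix) -> Matrix:
    """Sum of horizontal and vertical strip averages, broadcast to H x W."""
    h = len(feature)
    w = len(feature[0])

    # Horizontal strips: average each row over its full width (H x 1).
    row_means = [sum(row) / w for row in feature]
    # Vertical strips: average each column over its full height (1 x W).
    col_means = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]

    # Expand both strips back to H x W and fuse them by addition.
    return [[row_means[i] + col_means[j] for j in range(w)] for i in range(h)]


if __name__ == "__main__":
    for row in strip_pool_context([[1.0, 2.0], [3.0, 4.0]]):
        print(row)
```

Each strip is long in one direction and one pixel wide in the other, which matches the slide's description of long-range connection along one axis and local context along the other.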

Page 33: BING: Binarized Normed Gradients for Objectness Estimation ...

Visualization

LRD/SRD: long/short range dependency aggregation. MPM: mixed pooling module.

Page 34: BING: Binarized Normed Gradients for Objectness Estimation ...

Results

Image GT Results

Page 35: BING: Binarized Normed Gradients for Objectness Estimation ...

Results on ADE20K

Page 36: BING: Binarized Normed Gradients for Objectness Estimation ...

ILSVRC 2016

Ensemble vs. Single Model

Page 37: BING: Binarized Normed Gradients for Objectness Estimation ...

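Robust regression based on deep negative correlation learning

• Regression: input → corresponding output

• Dense crowd counting, age estimation, image super-resolution, ...

• Current mainstream approaches

• Designing robust loss functions

• A single hypothesis has insufficient capacity

• Ensemble learning (EL)

• Combining multiple models → many parameters → rarely used in practice

Nonlinear Regression via Deep Negative Correlation Learning, IEEE TPAMI, 2020.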

Page 38: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Negative Correlation Learning

• Ensemble learning (EL)

• Systematically controls the bias-variance-covariance of the sub-models

• DNCL (Deep Negative Correlation Learning)

• No additional parameters

Page 39: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• For a mapping G: 𝑿 → 𝒀, the loss function is

• Assume the ensemble model is obtained by averaging multiple sub-models
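
The equations on this slide did not survive in the transcript. As a hedged illustration only, the sketch below uses the classical negative correlation learning penalty: each sub-model is trained on its own squared error minus a term that rewards disagreement with the ensemble mean. The function name `ncl_losses` and the weight `lam` are assumptions for this example, and the exact formulation in the cited paper may differ.

```python
from typing import List, Sequence


def ncl_losses(predictions: Sequence[float], target: float, lam: float) -> List[float]:
    """Per-sub-model loss: 0.5 * (G_k - Y)^2 - lam * (G_k - G_mean)^2."""
    ensemble = sum(predictions) / len(predictions)  # ensemble = average of sub-models
    return [
        0.5 * (p - target) ** 2 - lam * (p - ensemble) ** 2
        for p in predictions
    ]


if __name__ == "__main__":
    preds = [1.0, 3.0]
    print(ncl_losses(preds, target=2.0, lam=0.0))  # plain squared error per sub-model
    print(ncl_losses(preds, target=2.0, lam=0.5))  # disagreement is rewarded
```

With `lam = 0` the sub-models are trained independently; a positive `lam` trades some individual accuracy for diversity among the sub-models.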

Page 40: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• DNCL (Deep Negative Correlation Learning)

• Systematically controls the bias-variance-covariance
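
For reference, the bias-variance-covariance decomposition is commonly written as below for an ensemble that averages K sub-models against a fixed target Y. The symbols K, G_k, and the barred averages are notation assumed here rather than taken from the slide.

```latex
\bar{G} = \frac{1}{K}\sum_{k=1}^{K} G_k, \qquad
\mathbb{E}\big[(\bar{G} - Y)^2\big]
  = \overline{\mathrm{bias}}^{\,2}
  + \frac{1}{K}\,\overline{\mathrm{var}}
  + \Big(1 - \frac{1}{K}\Big)\,\overline{\mathrm{covar}}

\overline{\mathrm{bias}} = \frac{1}{K}\sum_{k=1}^{K}\big(\mathbb{E}[G_k] - Y\big), \qquad
\overline{\mathrm{var}} = \frac{1}{K}\sum_{k=1}^{K}\mathbb{E}\big[(G_k - \mathbb{E}[G_k])^2\big]

\overline{\mathrm{covar}} = \frac{1}{K(K-1)}\sum_{k=1}^{K}\sum_{j \neq k}
  \mathbb{E}\big[(G_k - \mathbb{E}[G_k])(G_j - \mathbb{E}[G_j])\big]
```

The covariance term is the only one that can be negative, so pushing sub-models toward negatively correlated errors can lower the ensemble error without lowering each sub-model's own bias or variance.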

Page 41: BING: Binarized Normed Gradients for Objectness Estimation ...

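Robust regression based on deep negative correlation learning

• Each sub-model is accurate and "diversified"

• No increase in computation

• Implemented by using group convolution (Group-conv) to split the top-level features into chunks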
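
One way to picture the group-convolution trick, sketched under simplifying assumptions: the top-level feature vector is cut into K equal chunks, each chunk feeds its own small linear regressor (one sub-model), and the ensemble output is their average. The function name `grouped_ensemble_predict` and the plain dot-product regressors are illustrative choices, not the paper's implementation.

```python
from typing import List, Sequence, Tuple


def grouped_ensemble_predict(
    features: Sequence[float],
    weights_per_group: Sequence[Sequence[float]],
) -> Tuple[float, List[float]]:
    """Split features into K chunks, regress on each, and average the results."""
    k = len(weights_per_group)
    if len(features) % k != 0:
        raise ValueError("feature length must be divisible by the number of groups")
    width = len(features) // k

    predictions = []
    for g, weights in enumerate(weights_per_group):
        chunk = features[g * width:(g + 1) * width]  # features seen by sub-model g
        predictions.append(sum(f * w for f, w in zip(chunk, weights)))

    return sum(predictions) / k, predictions


if __name__ == "__main__":
    mean, subs = grouped_ensemble_predict(
        [1.0, 2.0, 3.0, 4.0], [[1.0, 1.0], [0.5, 0.5]]
    )
    print(mean, subs)
```

In this toy setting the K regressors together hold as many weights as one dense regressor over the full feature vector, which is consistent with the slides' claim that the ensemble adds no parameters or computation.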

Page 42: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Theoretical proof

• Ensemble error ≤ average error of the sub-models

• Lower Rademacher complexity → easier to optimize
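
For squared error and an averaged ensemble, the inequality between ensemble error and average sub-model error follows from a standard identity: the ensemble error equals the mean sub-model error minus the mean squared deviation of the sub-models from the ensemble. The sketch below checks that identity numerically; the function name is hypothetical, and the slide's own proof may take a different route.

```python
from typing import Sequence, Tuple


def ambiguity_decomposition(
    predictions: Sequence[float], target: float
) -> Tuple[float, float, float]:
    """Return (ensemble error, mean sub-model error, diversity) for squared error."""
    k = len(predictions)
    ensemble = sum(predictions) / k
    ensemble_error = (ensemble - target) ** 2
    mean_member_error = sum((p - target) ** 2 for p in predictions) / k
    diversity = sum((p - ensemble) ** 2 for p in predictions) / k
    return ensemble_error, mean_member_error, diversity


if __name__ == "__main__":
    ens, avg, div = ambiguity_decomposition([1.0, 2.0, 6.0], target=2.0)
    # Identity: ensemble error = mean sub-model error - diversity
    print(ens, avg - div)
```

Since the diversity term is never negative, the ensemble error can never exceed the average sub-model error, and it is strictly smaller whenever the sub-models disagree.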

Page 43: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Application 1: Crowd counting

Page 44: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Application 2: Personality analysis

Page 45: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Application 2: Personality analysis

Comparison of the properties of the proposed method vs. the top teams in the ChaLearn First Impressions Challenge.

Page 46: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Application 3: Age estimation

Page 47: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Application 4: Super-resolution

Page 48: BING: Binarized Normed Gradients for Objectness Estimation ...

Robust regression based on deep negative correlation learning

• Application 4: Super-resolution

Page 49: BING: Binarized Normed Gradients for Objectness Estimation ...

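Summary of the technical report

• Hierarchical progressive residual network → rich-scale feature extraction

• Image classification

• Object detection

• Activation map prediction

• Salient object detection

• Semantic segmentation

• Instance segmentation

• Keypoint estimation

• Interactive segmentation

• Deep negative correlation learning

→ Single-model compute cost, multi-model performance

• Crowd counting

• Age estimation

• Personality analysis

• Super-resolution

• Adaptive pooling

• Anisotropic global information

• State-of-the-art (SOTA) semantic segmentation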

Page 50: BING: Binarized Normed Gradients for Objectness Estimation ...

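AI-assisted diagnosis of COVID-19 from CT images

• In use at 50+ hospitals in China and abroad: ↓ time, ↑ accuracy

• As of March 26, it had served 153,000 suspected patients

• The system has been deployed in the United States, Italy, Japan, Russia, and other countries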

Page 51: BING: Binarized Normed Gradients for Objectness Estimation ...

Case studies

Page 52: BING: Binarized Normed Gradients for Objectness Estimation ...

Tuberculosis Diagnosis

Rethinking Computer-aided Tuberculosis Diagnosis, CVPR 2020.

Page 53: BING: Binarized Normed Gradients for Objectness Estimation ...

Tuberculosis Diagnosis

Human study by radiologists: accuracy is 68.7%, or 84.8% when active and latent TB are not distinguished

Page 54: BING: Binarized Normed Gradients for Objectness Estimation ...

Thank you!