3. Image Ranking: 5. Monocular Depth Estimation on KITTI · We learn the two continuous domain line...

1. ObjectiveOur goal is to substitute the 1-hot label representations in ordinalregression problems. Instead of building an ensemble of K-1 binaryclassifiers that distinguish rank thresholds, we use the same default,fully-connected output layer of an off-the-shelf classification CNN witha K-element soft label:

5. Monocular Depth Estimation on KITTISORD can estimate depth using segmentation networks, adapting ascale-increasing discretization method (SID) with K=120 intervals:

2. MethodOur Soft Ordinal labels (SORD) encode metric penalties à la Softmax :

• is the list of ordinal ranks in the problem.

• penalizes how far a given rank is from the true value .

When using Softmax and a cross-entropy loss, the computation of thegradient with respect to the output probits is simply:

SORD trains the network to learn features so that the outputs foreach rank match the interclass distance , reaching the minima when:

SORD Features:

- General-purpose: a wild variety of tasks can use SORD.

- Simple to implement: just two lines of code.

- No new CNN architectures: an off-the-shelf network is sufficient.

- Inference can be done with either argmax or expected value.

- Continuous domains: is valid when . This smoothly balancesthe SORD label without needing discretization or quantization.

3. Image Ranking: inputs with equally spaced labelsWe use an off-the-shelf VGG-16 and the metric:

Image Aesthetics dataset: 15K images labeled in 5 ordinal categories

Adience dataset: 26K images labeled in 8 age groups

1 ‐ Unacceptable 2 ‐ Flawed 3 ‐ Ordinary 4 ‐ Professional 5 ‐ Exceptional

1 – (0-2) 2 – (4-6) 3 – (8-13) 4 – (15-20)

5 – (25-32) 6 – (38-43) 7 – (48-53) 8 – (>60)

0.45

0.5

0.55

0.6

0.65

Accuracy Mean Average Error

Lean DNN 1‐HOT Niu et al CNN‐POR SORD

4. Horizon Lines in the Wild: 100K images from SfM modelsWe learn the two continuous domain line parameters by interpolating100 bins from the training data, using Resnet50 and the metrics:

65

70

75

80

85

90

95AUC Score (%)

HLW SLNetSORD SORD‐EV

Raúl Díaz and Amit Marathe{raul.diaz.garcia, amit.marathe}@hp.com

Overall

Nature

AnimalUrban

People

Accuracy %Overall

Nature

AnimalUrban

People

Mean Absolute Error

RED‐SVM 1‐HOT Niu et al CNN‐POR SORD

Results on validation data

FCN Squared FCN Squared Log DeepLabv3+ Squared DeepLabv3+ Squared Log DeepLabv3+ Squared Log (Cityscapes)

6

6.5

7

7.5

8

8.5

9

absErrorRel1

1.21.41.61.82

2.22.4

sqErrorRel90

92

94

96

98

δ < 1.259

10

11

12

13

SILog

8

9

10

11

12

13

absErrorRel2

2.5

3

3.5

4

4.5

sqErrorRel12

13

14

15

16

iRMSE11

12

13

14

15

SILog

Input Image Depth Prediction Error Map

Selected References[1] H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao. Deep ordinal

regression network for monocular depth estimation. In CVPR, 2018.

[2] J.-T. Lee, H.-U. Kim, C. Lee, and C.-S. Kim. Semantic line detection and its applications. In ICCV, 2017.

[3] Y. Liu, A. Wai Kin Kong, and C. Keong Goh. A constrained deep neural network for ordinal regression. In CVPR, 2018.

[4] S. Workman, M. Zhai, and N. Jacobs. Horizon lines in the wild. In BMVC, 2016.

CNN

F T

F T

F T

F T

𝑦 𝑟 ?

𝑦 𝑟 ?

𝑦 𝑟 ?

𝑦 𝑟 ?

Binary Classifiers

1

2

3

4

5

6

∙∙∙K

Soft Label

Results on the official KITTI benchmark

APMoE DABC VGG16‐Unet SORD DORN

3. Image Ranking: 5. Monocular Depth Estimation on KITTI · We learn the two continuous domain line...

Documents

Transcript of 3. Image Ranking: 5. Monocular Depth Estimation on KITTI · We learn the two continuous domain line...