Fully Convolutional Networks for Semantic Segmentationjhoffman/yahooJapan... · 2016. 3. 25. ·...
Transcript of Fully Convolutional Networks for Semantic Segmentationjhoffman/yahooJapan... · 2016. 3. 25. ·...
UC Berkeley
Fully Convolutional Networksfor Semantic Segmentation
Jonathan Long* Evan Shelhamer* Trevor Darrell1
- what kind of thingis each pixel part of?
- what kind of stuffis each pixel?
Challenges- tension between
recognition and localization- amount of computation
Semantic Segmentation
2
person
horse
car
Segmentation: PASCAL VOC
3
per-son
horse
car
deep learning with Caffe
end-to-end networks lead to50% relative improvement or 30 points absolute and >100x speedup in 1 year!
FCN:pixelwise convnet
state-of-the-art, in Caffe
Leaderboard
4
“tabby cat”
1000-dim vector
~1 millisecond
convnets perform classification
end-to-end learning
5
~100 ms
end-to-end learning
???
convnets perform segmentation?
“tabby cat”
6
a classification network
7
becoming fully convolutional
8
becoming fully convolutional
9
upsampling output
conv, pool,nonlinearity
upsampling
pixelwiseoutput + loss
end-to-end, pixels-to-pixels network
10
resultsFCN SDS* Truth Input
11*Simultaneous Detection and Segmentation Hariharan et al. ECCV14
Relative to prior state-of-the-art SDS:
- 30% relative improvementin accuracy(67.2% on VOC 2012)
- 286× faster
spectrum of deep features
combine where (local, shallow) with what (global, deep)
fuse features into deep jet
(cf. Hariharan et al. CVPR15 “hypercolumn”) 12
stride 32
no skips
stride 16
1 skip
stride 8
2 skips
ground truthinput image
skip layer refinement
13
[ comparison credit: CRF as RNN, Zheng* & Jayasumana* et al. ICCV 2015 ]
14DeepLab: Chen* & Papandreou* et al. ICLR 2015. CRF-RNN: Zheng* & Jayasumana* et al. ICCV 2015
graphical model refinement
nets for many pixelwise tasks
semanticsegmentation
15
monocular depth estimation (Eigen & Fergus 2015)
boundary prediction (Xie & Tu 2015)optical flow Fischer et al. 2015
fcn.berkeleyvision.org
conclusionfully convolutional networks are fast, end-to-end models for pixelwise problems
- code in Caffe master- models for PASCAL VOC, NYUDv2,
SIFT Flow, PASCAL-Context
16
caffe.berkeleyvision.org
github.com/BVLC/caffe
model exampleinference examplesolving example