1st VALSE Workshop on Pixel level image understanding
1st VALSE Workshop on Pixel Level Image Understanding, 8:00-12:00, 20 April, VALSE 2018, Ming-Ming Cheng
http://mmcheng.net/pixelund/
VALSE 2018 · Dalian
20th April
Ming-Ming Cheng
Workshop Organizers
Liang Lin (Sun Yat-sen University), Ming-Ming Cheng (Nankai University)
Invited Speakers
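Si Liu (Institute of Information Engineering, CAS), Yunchao Wei (UIUC), Chao Dong (SenseTime)
Xinggang Wang (Huazhong University of Science and Technology), Ming-Ming Cheng (Nankai University)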
Learning Pixel Accurate Image Semantics from Web
Speaker: Ming-Ming Cheng
Nankai University
http://mmcheng.net/
Ming-Ming Cheng
Dataset Annotation
CVML 2012, Antonio Torralba
• PASCAL 11:
  • 10? workers
  • 27,374 bounding boxes
• ImageNet:
  • 25,000 workers
  • 11,231,732 images labeled with one word
• ADE20K:
  • Prof. Torralba's mother labeled 213,841 segmented objects
  • Job offer: I am looking for more parents
How do we learn ourselves?
Question
• Could we get rid of the user annotation process?
  • Even keyword-level supervision would need significant effort to learn new categories.
• Could a machine vision system learn from the web?
  • Autonomous learning from the web
  • Without relying on any explicit user annotations
Salient object detection & weak supervision
Global Contrast based Salient Region Detection, IEEE TPAMI 2015 (CVPR 2011). (2000+ citations)
More category-agnostic cues?
WebSeg: Learning Semantic Segmentation from Web Searches, arXiv, 2018.
Richer Convolutional Features for Edge Detection, IEEE CVPR 2017.
Deeply supervised salient object detection with short connections, IEEE TPAMI 2018 (CVPR’17).
Salient object detection (SOD)
Deeply supervised salient object detection with short connections, IEEE TPAMI 2018 (CVPR’17).
Utilizing multi-scale features
Bridging between multi-levels
Sample results
Messages from numbers
Performance (using different datasets)
• Training on the corresponding training set works best
  • Especially obvious for DUT-OMRON
• More training images ≠ better performance
Performance (using different datasets)
• Construct a unified, composite, and versatile dataset
• Online benchmark: https://mmcheng.net/dss/
All results are obtained without any post-processing.
Failure cases
Sample Applications
Edge detection
Richer Convolutional Features for Edge Detection, IEEE CVPR 2017.
Richer Convolutional Features
Explicit multi-scale still helps
Samples
Columns: image, ground truth, results
50+ years of boundary detection
Since Roberts (1965)
Category-agnostic cues…
WebSeg: Learning Semantic Segmentation from Web Searches, arXiv, 2018.
Richer Convolutional Features for Edge Detection, IEEE CVPR 2017.
Deeply supervised salient object detection with short connections, IEEE TPAMI 2018 (CVPR’17).
Over-segmentation
• Challenges
  • Image label ≉ semantic category
  • How many labels to learn?
HFS: Hierarchical Feature Selection for Efficient Image Segmentation, ECCV 2016.
DEL: Deep Embedding Learning for Efficient Image Segmentation, IJCAI 2018.
Deep Embedding Learning
Proxy GT from web searches
Our framework
Noise Filtering Module (NFM)
• Given image 𝐼, image-level label 𝑦, and heuristic map 𝐻, we learn to predict a binary label for each region 𝑅
  • Extract an equal number of features for each region
  • Learn to discard potentially noisy labels
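The region-level filtering idea on this slide can be illustrated with a rough sketch. This is not the WebSeg implementation: the function names, the uniform sampling of `k` feature vectors per region, the mean-pooled linear scorer, and the ignore value 255 are all assumptions made for illustration, and the training of the scorer's `weights` and `bias` is not shown.

```python
import numpy as np


def sample_region_features(features, regions, k, rng):
    """Draw exactly k feature vectors from every region (hypothetical helper).

    features: (H, W, C) array of per-pixel features.
    regions:  (H, W) integer map; each value identifies one region R.
    Returns a dict {region_id: (k, C) array}.
    """
    h, w, c = features.shape
    flat_feats = features.reshape(h * w, c)
    flat_regions = regions.reshape(h * w)
    samples = {}
    for rid in np.unique(flat_regions):
        idx = np.flatnonzero(flat_regions == rid)
        # Sample with replacement only when the region has fewer than k pixels.
        chosen = rng.choice(idx, size=k, replace=idx.size < k)
        samples[int(rid)] = flat_feats[chosen]
    return samples


def score_regions(samples, weights, bias):
    """Assumed scorer: mean-pool the k samples, then apply a logistic unit.

    Returns {region_id: probability that the region's proxy label is reliable}.
    """
    scores = {}
    for rid, feats in samples.items():
        logit = feats.mean(axis=0) @ weights + bias
        scores[rid] = float(1.0 / (1.0 + np.exp(-logit)))
    return scores


def filter_proxy_labels(heuristic, regions, scores, ignore=255, keep_threshold=0.5):
    """Replace the heuristic labels of low-scoring regions with an ignore value."""
    filtered = heuristic.copy()
    for rid, prob in scores.items():
        if prob < keep_threshold:
            filtered[regions == rid] = ignore
    return filtered
```

Under these assumptions, a caller would compute `samples = sample_region_features(features, regions, k, rng)` with `rng = np.random.default_rng()`, score them, and pass the result to `filter_proxy_labels` so that regions judged noisy are excluded from the segmentation loss.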
Learning to Filter Noisy Labels
Testing phase
• NFM only used during testing
Effect of using different cues
• PASCAL VOC 2012 validation set, no post-processing
The role of NFM
Using different training data
• 𝐷(𝑆): Simple web images, manually cleaned, 1 label
• 𝐷(𝐶): Complex images with multiple image-level labels
• 𝐷(𝑊): Web images, 1 non-cleaned label for each image
Using CRF
Visual comparisons
Results on validation & test set
Comparisons
Conclusion
• Propose an interesting and challenging vision problem
  • WebSeg: learning semantic segmentation directly from the web
• An online noise filtering mechanism
  • Lets CNNs learn to discard undesired noisy regions
Future work
• Never-ending learning
• Effectively select good web images to learn from
• Customized salient object detection
• Improve the quality of heuristic cues
• Noise filtering mechanisms
• Other tasks using purely web supervision
We have only scratched the surface of purely web supervision!
Source code
free
Some closely related projects
FLIC: Fast Linear Iterative Clustering with Active Search, AAAI 2018.
Hi-Fi: Hierarchical Feature Integration for Skeleton Detection, IJCAI 2018.
S4Net: Single Stage Salient-Instance Segmentation, arXiv 2017.
Three Birds One Stone: A Unified Framework for Salient Object Segmentation, Edge Detection and Skeleton Extraction, arXiv 2018.
Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground, arXiv 2018.
Thanks! Q&A