Large-Scale Object Recognition with Weak Supervision


Weiqiang Ren, Chong Wang, Yanhua Cheng,

Kaiqi Huang, Tieniu Tan

{wqren,cwang,yhcheng,kqhuang,tnt}@nlpr.ia.ac.cn

Task 2: Classification + Localization

Task 2b: Classification + localization with additional training data (ordered by classification error)

1. Only classification labels are used

2. Full image as object location

Outline

• Motivation

• Method

• Results

Motivation

Knowing where to look makes recognizing objects easier!

However, in the classification-only task, no annotations of object location are available.

Weakly Supervised Localization

Why Weakly Supervised Localization (WSL)?

Current WSL Results on VOC07

[Bar chart: detection mAP (%) on PASCAL VOC 2007 for recent WSL methods, with bars ranging from 13.9 to 33.7; the individual results are listed below]

13.9: Weakly supervised object detector learning with model drift detection, ICCV 2011

15.0: Object-centric spatial pooling for image classification, ECCV 2012

22.4: Multi-fold mil training for weakly supervised object localization, CVPR 2014

22.7: On learning to localize objects with minimal supervision, ICML 2014

26.4: Weakly supervised object detection with posterior regularization, BMVC 2014

31.6: Weakly supervised object localization with latent category learning, ECCV 2014 Sep 11, Poster Session 4A, #34

26.2: Discovering Visual Objects in Large-scale Image Datasets with Weak Supervision, submitted to TPAMI

VOC 2007 Results

Ours: 31.6 mAP
DPM 5.0: 33.7 mAP

Weakly Supervised Object Localization with Latent Category Learning

ECCV 2014

VOC 2007 Results

Ours: 26.2 mAP
DPM 5.0: 33.7 mAP

Discovering Visual Objects in Large-scale Image Datasets with Weak Supervision

Submitted to TPAMI

Our Work

For efficiency on this large-scale task, we use the second of the two methods above.

Method

Framework

[Framework diagram: Input Images → Conv Layers → FC Layers, producing (1) the classification prediction and (2) the detection prediction, followed by (3) detection rescoring and (4) classification rescoring]

1st: CNN Architecture

Chatfield et al., Return of the Devil in the Details: Delving Deep into Convolutional Nets, BMVC 2014
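For orientation, here is a rough PyTorch sketch of a CNN-M-style network in the spirit of Chatfield et al.; the filter sizes, strides, and normalization choices below are illustrative assumptions for this write-up, not the team's actual configuration.

```python
# Rough CNN-M-style sketch (assumed hyperparameters, not the authors' exact net).
import torch.nn as nn

cnn_m_like = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=7, stride=2), nn.ReLU(inplace=True),
    nn.LocalResponseNorm(5), nn.MaxPool2d(3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.LocalResponseNorm(5), nn.MaxPool2d(3, stride=2),
    nn.Conv2d(256, 512, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(512, 512, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(512, 512, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.LazyLinear(4096), nn.ReLU(inplace=True), nn.Dropout(0.5),   # fc6
    nn.Linear(4096, 4096), nn.ReLU(inplace=True), nn.Dropout(0.5),  # fc7
    nn.Linear(4096, 1000),  # fc8: 1000-way ImageNet classification head
)
```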

2nd: MILinear SVM

Good region proposal algorithms should have:
• High recall
• High overlap
• A small number of proposals
• Low computation cost

MCG pretrained on VOC 2012 (additional data)
• Training: 128 windows/image
• Testing: 256 windows/image
• Compared to Selective Search (~2000 windows/image)

MILinear: Region Proposal

• Low-Level Features
  – SIFT, LBP, HOG
  – Shape context, Gabor, …
• Mid-Level Features
  – Bag of Visual Words (BoVW)
• Deep Hierarchical Features
  – Convolutional Networks
  – Deep Auto-Encoders
  – Deep Belief Nets

MILinear: Feature Representations

• Clustering
  – KMeans
• Topic Model
  – pLSA, LDA, gLDA
• CRF
• Multiple Instance Learning
  – DD, EM-DD, APR
  – MI-NN
  – MI-SVM, mi-SVM
  – MILBoost

MILinear: Positive Window Mining
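As a concrete illustration of the multiple-instance idea the following slides build on, here is a minimal sketch of one common MI-SVM-style mining step: in each positive image, the proposal window that the current linear model scores highest is treated as the positive training instance. Variable names and the alternation details are assumptions for this sketch, not the authors' code.

```python
# Minimal sketch of MI-SVM-style positive window mining (assumed details).
import numpy as np

def mine_positive_windows(bags, w, b):
    """bags: list of (num_windows, dim) feature matrices, one per positive image.
    w, b: current linear SVM weights and bias.
    Returns a (num_images, dim) array holding the top-scoring window per image."""
    mined = []
    for features in bags:
        scores = features @ w + b                  # linear scores for every window
        mined.append(features[np.argmax(scores)])  # keep the most confident window
    return np.stack(mined)

# In the usual alternation, these mined windows are used as positives to
# retrain the linear SVM, and the two steps are repeated until convergence.
```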

• Multiple-Instance Linear SVM (MILinear)
• Optimization: trust-region Newton
  – A truncated Newton method
  – Works in the primal
  – Faster convergence

MILinear: Objective Function and Optimization
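The formula on this slide did not survive extraction. As a hedged reconstruction, a standard multiple-instance linear SVM objective, with B_i the bag of proposal windows of image i and y_i ∈ {+1, −1} its image-level label, can be written as:

$$
\min_{\mathbf{w},\,b}\;\; \frac{1}{2}\lVert \mathbf{w}\rVert^{2}
\;+\; C \sum_{i=1}^{N} \max\!\Bigl(0,\; 1 - y_i \max_{\mathbf{x}\in B_i}\bigl(\mathbf{w}^{\top}\mathbf{x} + b\bigr)\Bigr)
$$

Under this reading, the trust-region Newton solver mentioned above optimizes w and b directly in the primal over the currently mined windows.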

MILinear: Optimization Efficiency

3rd: Detection Rescoring
• Rescoring with softmax

[Diagram: MILinear scores for 128 boxes × 1000 classes → per-class max over boxes → 1000-dim vector → train softmax → rescored 1000-dim prediction]

Softmax: considering all categories simultaneously in each minibatch of the optimization suppresses the responses of other, visually similar object categories.
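A minimal sketch of how this rescoring step could look, assuming the per-image MILinear window scores are max-pooled per class and a 1000-way softmax classifier is trained on top; names and shapes are illustrative, not the authors' implementation.

```python
# Sketch of softmax-based detection rescoring (assumed setup, not the authors' code).
import torch
import torch.nn as nn

def max_pool_window_scores(window_scores: torch.Tensor) -> torch.Tensor:
    """window_scores: (num_windows, 1000) MILinear scores for one image.
    Returns a (1000,) vector holding the best window score per class."""
    return window_scores.max(dim=0).values

# Rescoring model: a linear layer trained with cross-entropy (i.e. softmax),
# so all 1000 classes compete in every minibatch, which suppresses visually
# similar but wrong categories.
rescorer = nn.Linear(1000, 1000)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(rescorer.parameters(), lr=0.01)

def train_step(pooled_scores: torch.Tensor, labels: torch.Tensor) -> float:
    """pooled_scores: (batch, 1000) max-pooled scores, labels: (batch,) class ids."""
    optimizer.zero_grad()
    loss = criterion(rescorer(pooled_scores), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```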

4th: Classification Rescoring

• Linear Combination

S = α·S_cls + (1 − α)·S_WSL

[Diagram: the 1000-dim classification score vector and the 1000-dim WSL score vector are linearly combined into one 1000-dim prediction]

One interesting note: we tried several other score-combination strategies, but none of them seemed to work better.
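For completeness, a minimal sketch of the linear combination, assuming the reconstructed formula S = α·S_cls + (1 − α)·S_WSL; the mixing weight α and the idea of tuning it on held-out data are assumptions, not details from the slide.

```python
# Sketch of the final score combination (alpha is a hypothetical mixing weight).
import numpy as np

def combine_scores(s_cls: np.ndarray, s_wsl: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Linearly blend the 1000-dim classification and WSL score vectors."""
    return alpha * s_cls + (1.0 - alpha) * s_wsl

# In practice alpha would be swept (e.g. over np.linspace(0, 1, 21)) and the
# value with the lowest top-5 error on a validation set would be kept.
```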

Results

1st: Classification without WSL

Method / Top-5 Error (%)
Baseline with one CNN: 13.7
Average of four CNNs: 12.5

2nd: MILinear on ImageNet 2014
Method / Detection Error (%)
Baseline (full image): 61.96
MILinear: 40.96
Winner: 25.3

2nd: MILinear on VOC 2007

2nd: MILinear on ILSVRC 2013 detection

mAP: 9.63% vs. 8.99% (DPM 5.0)!

2nd: MILinear for Classification
Method / Top-5 Error (%)
MILinear: 17.1

3rd: WSL Rescoring (Softmax)
Method / Top-5 Error (%)
Baseline with one CNN: 13.7
Average of four CNNs: 12.5
MILinear: 17.1
MILinear + Rescore: 13.5

The softmax-based rescoring successfully suppresses the predictions of other, visually similar object categories!

4th: Cls and WSL Combination: S = α·S_cls + (1 − α)·S_WSL

Method / Top-5 Error (%)
Baseline with one CNN model: 13.7
Average of four CNN models: 12.5
MILinear: 17.1
MILinear + Rescore: 13.5
Cls (12.5) + MILinear (13.5): 11.5

WSL and Cls can be complementary to each other!

Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge.

Conclusion

• WSL always helps classification

• WSL has large potential: WSL data is cheap

Thank You!