Deep Representation for Imbalanced Classiﬁcation...

Large-scale CelebA face attribute dataset

• 200K celebrity images, each with 40 attributes

• Highly imbalanced: average positive class rate 23%

• Total accuracy =𝑡𝑝 + 𝑡𝑛

𝑁𝑝 + 𝑁𝑛Balanced accuracy =

𝑡𝑝

𝑁𝑝+

𝑡𝑛

𝑁𝑛

Edge detection on BSDS500 dataset

• Retrieve from 2M edge label patches with long-tail distribution

Learning Deep Representation for Imbalanced ClassificationChen Huang1,2, Yining Li1, Chen Change Loy1, Xiaoou Tang1

1The Chinese University of Hong Kong 2SenseTime Group Limited

{chuang, ly015, ccloy, xtang}@ie.cuhk.edu.hk

Data imbalance is common in visual classification

1. Motivation

Wearing

Not wearing

Minority class

Majority class

Face attribute example

Deep embedding: Class-level cluster- & class-level constraint

Study traditional re-sampling and cost-sensitive learning scheme

2. Main Idea

Triplet embedding

Class 1

minority

Class 2

majority

Class 2 majority

Quintuplet embedding

Class 1 minority

Cluster j

Cluster 2

Cluster 1 Cluster 1

𝐷 𝑓 𝑥𝑖 , 𝑓 𝑥𝑖𝑝+

< 𝐷(𝑓 𝑥𝑖 , 𝑓(𝑥𝑖𝑝−))

< 𝐷(𝑓 𝑥𝑖 , 𝑓(𝑥𝑖𝑝−−

)) < 𝐷(𝑓 𝑥𝑖 , 𝑓(𝑥𝑖𝑛))

𝑥𝑖 – an anchor

𝑥𝑖𝑝+

– the anchor’s most distant within-cluster neighbor

𝑥𝑖𝑝−

– the nearest within-class neighbor of the anchor, but from a different cluster

𝑥𝑖𝑝−−

– the most distant within-class neighbor of the anchor

𝑥𝑖𝑛 – the nearest between-class neighbor of the anchor

Triple-header hinge loss

Network architecture

• Equal class re-sampling & class costs assignment in batches

Training step

3. Large Margin Local Embedding (LMLE)

● Clustering by k-means

● Generate quintuplets from

cluster & class membership

● Re-sample batches equally from each class

● Forward their quintuplets to CNN to

compute loss

● Back-propagation

Feature-based clustering

Feature learning/updating

Every 5000 iterations

batches

Training

samples …

EmbeddingQuintuplet

Shared parameters

Large margin cluster-wise kNN: fast & imbalance resistant

4. Cluster-wise kNN search

-8 -6 -4 -2 0 2 4 6 8-5

-15 -10 -5 0 5 10-20

-10 -8 -6 -4 -2 0 2 4 6 8-8

-8 -6 -4 -2 0 2 4 6 8-5

-15 -10 -5 0 5 10-20

-10 -8 -6 -4 -2 0 2 4 6 8-8

-8 -6 -4 -2 0 2 4 6 8-5

-15 -10 -5 0 5 10-20

-10 -8 -6 -4 -2 0 2 4 6 8-8

-8 -6 -4 -2 0 2 4 6 8-5

-15 -10 -5 0 5 10-20

-10 -8 -6 -4 -2 0 2 4 6 8-8

Class 1: cluster 1

Class 1: cluster 2

Class 2: cluster 1

Class 2: cluster 2

Class 2: cluster 3

Class 2: cluster 4

Class 2: cluster 5

-8 -6 -4 -2 0 2 4 6 8-5

-15 -10 -5 0 5 10-20

-10 -8 -6 -4 -2 0 2 4 6 8-8

-8 -6 -4 -2 0 2 4 6 8-5

-15 -10 -5 0 5 10-20

-10 -8 -6 -4 -2 0 2 4 6 8-8

DeepID2 Triplet Our LMLE

5. Results

Total acc. Balanced acc.

Triplet-kNN 83 72

Anet 87 80

LMLE-kNN 90 84

Ground truthSketchToken

ODS 0.73

DeepContour

ODS 0.76

LMLE-kNN

ODS 0.78

6. Conclusion

Cluster- & class-level quintuplets preserve both locality

across clusters and discrimination between classes,

irrespective of class imbalance

Large margin classification by fast cluster-wise kNN search

Deep Representation for Imbalanced Classiﬁcation...

Documents

Transcript of Deep Representation for Imbalanced Classiﬁcation...

Towards incremental learning of nonstationary imbalanced data … · ensembleapproach(REA)inthispaper.Tobattleagainstthe imbalanced learning problem in training data chunk received

Imbalanced classiﬁcation: a paradigm-based review

Imbalanced Data Set Learning with Synthetic Examples

Online Active Learning with Imbalanced Classes

Tutorial on Learning Class Imbalanced Data Streams

Imbalanced social-communicative and restricted repetitive ......ARTICLE Imbalanced social-communicative and restricted repetitive behavior subtypes of autism spectrum disorder exhibit

An insight into classification with imbalanced data ... › sites › default › files › files › ... · An insight into classiﬁcation with imbalanced data: Empirical results

imbalanced data

Model 915 Indicator Service Manual - Amazon S3...11: 4" 200K x 10 200K x 20 200K x 50 200K x 100 200K x 200 200K x 500 12: CC-20 20K x 1 20K x 2 20K x 5 200K x 10 200K x 20 200K x

in November 2005 was about 200K

An insight into imbalanced Big Data classification ...150.214.190.154/sites/...CAIS-BigData-Imbalanced.pdf · Keywords Big Data · Imbalanced classiﬁcation · MapReduce · Pre-processing

Efficient Imbalanced Multimedia Concept Retrieval by Deep ...

Imbalanced text classiﬁcation: A term weighting approachccc.inaoep.mx/~villasen/bib/Imbalanced text classification- A term weighting...Imbalanced text classiﬁcation: A term weighting

Deep Imbalanced Attribute Classification using Visual ...openaccess.thecvf.com/content_ECCV_2018/papers/Nikolaos_Sarafi… · Learning from imbalanced data is a well-studied problem

Unsupervised Domain Adaptation with Imbalanced Cross ...

An Imbalanced World

FOUNDATIONS OF IMBALANCED LEARNINGgweiss/papers/foundations-imbalanced-13.pdfscribes the fundamental issues associated with learning from imbalanced data. This description provides

Addressing imbalanced classification with instance ... · Addressing imbalanced classiﬁcation with instance generation techniques: IPADE-ID Victoria Lópeza,n, Isaac Trigueroa,

Classiﬁcation of Imbalanced Marketing Data with Balanced ...clopinet.com/isabelle/Projects/KDDcup09/Papers/Nikulin-paper5.pdf · Classiﬁcation of Imbalanced Marketing Data with

Code Smell Detection and Identification in Imbalanced ...