GANs for Exploiting Unlabeled Data
Improved Techniques for Training GANs
Learning from Simulated and Unsupervised Images through Adversarial Training
Presented by: Uriya Pesso, Nimrod Gilboa Markevich
“[…] you could not see an article in the press [about AI] without the picture being Terminator. It was always Terminator, 100 percent. And you see less of that now, and that’s a good thing.”
Yann LeCun, Director, Facebook AI
“Generative Adversarial Networks is the most interesting idea in the last ten years in machine learning.”
Yann LeCun, Director, Facebook AI
Presentation Overview
o Motivation
o Intro to GANs
o Paper 1: Semi-Supervised Learning
o Paper 2: Simulated and Unsupervised Learning
o Conclusion
Motivation – Compensating for Missing Data
o Labeled data is expensive (time, money, effort)
o Unlabeled data is cheaper
o Unlabeled data still contains information
o How can we utilize unlabeled data?
o GANs!
◦ We will present two methods from two papers
Generative Adversarial Networks
o What does it do?
◦ Generate synthetic data that is indistinguishable from real data
o Uses
◦ Super-Resolution
◦ Text-to-Speech
◦ Art
◦ Exploiting unlabeled data (for training other networks)
Generative Adversarial Networks
o Adversarial Networks – Opponent Networks
◦ Generator – Create realistic samples
◦ Discriminator – Distinguish generated from real
o Compete with each other
o The only labels are Real/Fake
o Gradient Descent, Train in turns
Train D
Train G
Loss Functions
o Cross-entropy loss function:
L_D = -E_{x~p_data}[log D(x)] - E_{z~p_noise}[log(1 - D(G(z)))]
L_G = -L_D
o D(x) – probability of x being real
o G(z) – generated sample from noise z
o θ_D, θ_G – network parameters
o L_D – discriminator loss function
o L_G – generator loss function
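These losses are easy to check numerically. A minimal NumPy sketch (the function name and input format are our own illustration, not from the paper) that evaluates L_D and L_G from discriminator outputs:

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Cross-entropy GAN losses from discriminator outputs.

    d_real: D(x)    for a batch of real samples, values in (0, 1)
    d_fake: D(G(z)) for a batch of generated samples, values in (0, 1)
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # L_D = -E[log D(x)] - E[log(1 - D(G(z)))]
    loss_d = -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))
    # L_G = -L_D: the generator ascends the discriminator's objective
    loss_g = -loss_d
    return loss_d, loss_g
```

A confident discriminator (d_real near 1, d_fake near 0) drives L_D toward 0; a fooled one drives it up, which is exactly what the generator wants.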
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen
Improved Techniques for Training GANs
Semi-Supervised Learning – Idea
o Objective: We want to train a classifier
o We have:
◦ labeled data => supervised learning
◦ unlabeled data => unsupervised learning
o Combining labeled and unlabeled data => semi-supervised learning
Semi-Supervised Learning – Idea
o Combine the GAN and classifier networks.
o Add a label for the synthetic data: K+1
o Supervised: Classifier maps x → y, y ∈ {1, .., K}
o Unsupervised (GAN): Generator maps z → G(z); Discriminator D maps x or G(z) → y ∈ {Generated, Real}
o Semi-Supervised: Generator maps z → G(z); combined Classifier/Discriminator D maps x or x* = G(z) → y ∈ {1, .., K, K+1}
Semi-Supervised Loss Function
o Cross-entropy loss over two distributions:
L = -E_{x,y~p_data(x,y)}[log p_model(y | x)] = L_supervised + L_unsupervised
L_supervised = -E_{x,y~p_data(x,y)}[log p_model(y | x, y < K+1)]
L_unsupervised = -E_{x~p_data(x)}[log(1 - p_model(y = K+1 | x))] - E_{x~G}[log p_model(y = K+1 | x)]
◦ p_data(x, y) – the probability distribution of the input data.
◦ p_model(y | x) – the model probability of predicting label y out of the K labels, given data x.
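Both terms can be computed directly from a (K+1)-way softmax output. A hedged NumPy sketch (names and shapes are our illustration; in the paper these losses are optimized inside the network):

```python
import numpy as np

def semi_supervised_losses(p_labeled, y, p_unlabeled, p_generated):
    """Semi-supervised losses over (K+1)-class softmax outputs.

    p_labeled:   (N, K+1) model probabilities for labeled real samples
    y:           (N,)     integer labels in 0..K-1
    p_unlabeled: (M, K+1) model probabilities for unlabeled real samples
    p_generated: (M, K+1) model probabilities for generated samples
    The last class (index K) is the "fake" class K+1.
    """
    p_labeled = np.asarray(p_labeled, dtype=float)
    p_unlabeled = np.asarray(p_unlabeled, dtype=float)
    p_generated = np.asarray(p_generated, dtype=float)
    y = np.asarray(y)
    # L_supervised = -E[log p_model(y | x, y < K+1)]:
    # condition on "not fake" by renormalizing over the first K classes
    p_real_classes = p_labeled[:, :-1]
    p_cond = p_real_classes / p_real_classes.sum(axis=1, keepdims=True)
    l_sup = -np.mean(np.log(p_cond[np.arange(len(y)), y]))
    # L_unsupervised = -E_data[log(1 - p(K+1|x))] - E_G[log p(K+1|x)]
    l_unsup = (-np.mean(np.log(1.0 - p_unlabeled[:, -1]))
               - np.mean(np.log(p_generated[:, -1])))
    return l_sup, l_unsup
```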
Supervised Loss
L_supervised = -E_{x,y~p_data(x,y)}[log p_model(y | x, y < K+1)]
o Where:
◦ p_data(x, y) – the probability distribution of the input data.
◦ p_model(y | x) – the model probability of predicting label y out of the K labels, given data x.
Classifier maps x → y, y ∈ {1, .., K, K+1}
Supervised Loss
L_supervised = -E_{x,y~p_data(x,y)}[log p_model(y | x)]
o p_data(x, y) – the probability distribution of the input data.
o p_model(y | x) – the model probability of predicting label y out of the K labels, given data x.
Classifier maps x → y, y ∈ {1, .., K}
Unsupervised Loss
L_unsupervised = -E_{x~p_data(x)}[log(1 - p_model(y = K+1 | x))] - E_{x~G(z)}[log p_model(y = K+1 | x)]
o p_model(y = K+1 | x) – the model's probability that the data x is not real.
Generator maps z → G(z); Discriminator D maps x or G(z) → y ∈ {Generated, Real}
o Substituting D(x) = 1 - p_model(y = K+1 | x) recovers the standard GAN discriminator loss:
L_D = -E_{x~p_data}[log D(x)] - E_{z~p_noise}[log(1 - D(G(z)))]
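That substitution is a one-liner. A toy sketch (our own illustration) of recovering the implicit discriminator from the (K+1)-way softmax:

```python
import numpy as np

def discriminator_from_classifier(p):
    """D(x) = 1 - p_model(y = K+1 | x).

    p: (K+1,) softmax output of the classifier; the last entry is the
    probability of the 'fake' class K+1. The total mass the model puts
    on the K real classes is the implicit GAN discriminator output.
    """
    p = np.asarray(p, dtype=float)
    return 1.0 - p[-1]
```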
Punchline
L = L_supervised + L_unsupervised
L_supervised = -E_{x,y~p_data(x,y)}[log p_model(y | x, y < K+1)]   (Classifier)
L_unsupervised = -E_{x~p_data(x)}[log(1 - p_model(y = K+1 | x))] - E_{x~G}[log p_model(y = K+1 | x)]   (GAN)
Semi-Supervised Learning – Intuition
o How does a classifier benefit from unlabeled data?
o From the unlabeled data, the classifier learns to focus on the right features, thus reducing generalization error
(Figure: samples grouped as “Number” vs. “Not Number”)
Results
Results - MNIST
(Figure: Generated vs. Real samples)
Results - MNIST
(*) Number of labeled samples per class; the rest are unlabeled. Train size: 60,000; test size: 10,000.
Results – CIFAR10
CIFAR10 Generated
Results – Imagenet
DCGAN (not “Ours”)
Results – Imagenet
“Our” Method
Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, Russ Webb – Apple Inc.
Learning from Simulated and Unsupervised Images through Adversarial Training
CVPR 2017 Best Paper Award
Motivation
o Labeled data is expensive (time, money, effort)
o Simulated data is cheap
o Alternatively, we could learn using simulated data
Simulator → Data Set of Labeled Synthetic Images → NN
Motivation
o Drawback: Simulated data may have artifacts
o We want more realistic synthetic data
o We have unlabeled data
o Let’s build a refiner
Refiner Network / S+U Learning
Simulator → Data Set of Labeled Synthetic Images
SimGAN Architecture
Loss Function – Discriminator
o Adversarial Loss:
L_D = -E_{x~p_real}[log D(x)] - E_{x~p_simulated}[log(1 - D(R(x)))]
o D(x) – probability of x being real
o R(x) – refined simulated image
o θ_D, θ_R – network parameters
o L_D – discriminator loss function
o p_real(x) – pdf of real images
o p_simulated(x) – pdf of simulated images
Loss Function – Refiner
o Refiner has two goals:
◦ Generate realistic samples – Adversarial Loss
◦ Preserve the label – Self-regularization
L_R = E_{x~p_simulated}[ℓ_real + λ ℓ_reg]
ℓ_real = -log(D(R(x)))
ℓ_reg = ||R(x) - x||_1
o R(x) – refined simulated image
o θ_D, θ_R – network parameters
o L_R – refiner loss function
o p_simulated(x) – pdf of simulated images
o λ – regularization weight
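A minimal NumPy sketch of this loss (the function name, input shapes, and the λ value are illustrative assumptions, not from the paper):

```python
import numpy as np

def refiner_loss(d_refined, refined, simulated, lam=0.1):
    """SimGAN-style refiner loss: realism term plus L1 self-regularization.

    d_refined: D(R(x)) – discriminator outputs on refined images, in (0, 1)
    refined:   R(x)    – refined images, shape (N, ...)
    simulated: x       – original simulated images, same shape
    lam:       regularization weight (illustrative value)
    """
    d_refined = np.asarray(d_refined, dtype=float)
    refined = np.asarray(refined, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    # l_real: fool the discriminator into calling R(x) real
    l_real = -np.mean(np.log(d_refined))
    # l_reg: per-image L1 distance ||R(x) - x||_1, keeps the label intact
    per_image_axes = tuple(range(1, refined.ndim))
    l_reg = np.mean(np.abs(refined - simulated).sum(axis=per_image_axes))
    return l_real + lam * l_reg
```

The λ weight trades realism against label preservation: too small and the refiner may hallucinate, too large and it barely changes the simulated image.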
Minor Improvements
o Are we done?
o Almost. There are two additional mechanisms:
◦ Local Adversarial Loss
◦ Using a History of Refined Images
Local Adversarial Loss
o Discriminator outputs a w×h probability map
o The adversarial loss is the sum of the loss over the local patches
o Localization restrains artifacts – the refined image should look real in every patch
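Summing the cross-entropy over the patch map can be sketched as follows (shapes and names are our assumption):

```python
import numpy as np

def local_adversarial_loss(prob_map, is_real):
    """Sum of cross-entropy losses over a w x h map of patch probabilities.

    prob_map: (w, h) discriminator outputs, each entry the probability
              that the corresponding local patch is real
    is_real:  True for a real input image, False for a refined one
    """
    prob_map = np.asarray(prob_map, dtype=float)
    # each local patch contributes its own cross-entropy term
    per_patch = np.log(prob_map) if is_real else np.log(1.0 - prob_map)
    return -np.sum(per_patch)
```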
Local Adversarial Loss
Using a History of Refined Images
o Problem
◦ Discriminator only focuses on the latest refined images
◦ Refiner might reintroduce artifacts that the discriminator has forgotten
o Solution
◦ Buffer refined images
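The buffering scheme can be sketched as follows. The half-and-half mini-batch split and random eviction follow the paper's description, but the class and method names, and the capacity value, are our own:

```python
import random

class RefinedImageHistory:
    """Buffer of previously refined images for training the discriminator.

    Half of each discriminator mini-batch comes from the current refiner,
    half from this history, so old artifacts are not forgotten.
    """
    def __init__(self, capacity=512):
        self.capacity = capacity
        self.buffer = []

    def sample_batch(self, current):
        half = len(current) // 2
        if len(self.buffer) >= half:
            # half old refined images, half fresh ones
            batch = random.sample(self.buffer, half) + list(current[half:])
        else:
            batch = list(current)
        # store the newest refined images, evicting random old entries
        for img in current:
            if len(self.buffer) < self.capacity:
                self.buffer.append(img)
            else:
                self.buffer[random.randrange(self.capacity)] = img
        return batch
```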
Using a History of Refined Images
Results
Stages for SimGAN Performance Evaluation
1) Train SimGAN
2) Generate Synthetic Refined Dataset => Qualitative Results
3) Train NN (Estimator) with the Dataset
4) Test NN (Estimator) => Quantitative Results
Results – Performance Evaluation using Gaze Estimation
o Simulated images – UnityEyes, 1.2M
o Real Labeled Dataset – MPIIGaze Dataset, 214K
Gaze Estimator
Process – 1. Train SimGAN
UnityEyes
MPIIGaze (without labels)
Data Set of Labeled Synthetic Images
Process – 2. Trained Refiner Generates DB
SimGAN Refiner
Data Set of Labeled Refined Synthetic Images
Qualitative Results
Process – 3. Train Gaze Estimator
Data Set of Labeled Refined Synthetic Images
Gaze Estimator
Data Set of Real Images
Process – 4. Test Gaze Estimator
Gaze Estimator
CNN
Output
Quantitative Results
Results – Performance Evaluation using Hand Pose Estimation
o NYU hand pose dataset, 73,000 training, 8,000 testing
Hand Pose Estimator: Depth Image → Selected Point Coordinates
Qualitative Results
Quantitative Results
Conclusion
o Generative Adversarial Networks are awesome
◦ Generator vs. Discriminator
o Unlabeled data can be used for supervised learning
◦ Semi-Supervised Learning
◦ Classifier combined with Discriminator
◦ Train GAN with labeled and unlabeled data
◦ Simulated and Unsupervised Learning
◦ Train Refiner
◦ Generate large synthetic refined dataset