Generative adversarial networks
Hakan Bilen
Machine Learning Practical - MLP Lecture 13, 23 January 2019
http://www.inf.ed.ac.uk/teaching/courses/mlp/
Slide credits: Ian Goodfellow
Generative modeling
• Density estimation
• Sample generation
Why are generative models useful?
• Test of our intellectual ability to use high-dimensional probability distributions
• Learn from simulated data and transfer it to the real world
• Complete missing data (including multi-modal)
• Realistic generation tasks (image and speech)
Next video frame prediction
Lotter et al. Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning, ICLR'17
Photo-realistic super-resolution
Ledig et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, CVPR'17
Image-to-image translation
Isola et al. Image-to-Image Translation with Conditional Adversarial Nets, CVPR'17
Maximum likelihood
For simplicity, assume that all generative models work by maximising the likelihood of the training data.
• $x$: input
• $p_{\text{data}}$: probability density function of the input samples
• $p_{\text{model}}(x;\theta)$: estimated probability density function, parameterised by $\theta$
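Written out (a standard formulation, not printed on the slide), maximum likelihood picks the parameters that make the training data most probable under the model:

```latex
\theta^{\ast} \;=\; \arg\max_{\theta}\; \mathbb{E}_{x \sim p_{\text{data}}} \log p_{\text{model}}(x;\theta)
```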
Taxonomy of generative methods
Generative adversarial networks
[Figure: noise $z$ feeds the generator $G(\cdot\,;\theta_G)$ (the counterfeiter), whose sample $G(z;\theta_G)$, alongside real samples $x$, feeds the discriminator $D(\cdot\,;\theta_D)$ (the police), which outputs a score between 0 (fake) and 1 (real).]
• $D$ tries to make $D(x)$ near 1 and $D(G(z))$ near 0
• $G$ tries to make $D(G(z))$ near 1
Minimax game
$$V(\theta_G,\theta_D) = \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\text{data}}} \log D(x;\theta_D) + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z} \log\bigl(1 - D(G(z;\theta_G))\bigr)$$
$$\min_{\theta_G}\,\max_{\theta_D}\, V(\theta_G,\theta_D)$$
• $D$ wishes to maximise $V(\theta_G,\theta_D)$ and controls $\theta_D$
• $G$ wishes to minimise $V(\theta_G,\theta_D)$ and controls $\theta_G$
• Unlike ordinary optimisation, the solution is not a local minimum of a single objective
• It is a game between two players, whose solution is a Nash equilibrium
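A useful consequence of the minimax form (the derivation is from Goodfellow et al., 2014, not shown on the slide): for a fixed generator, maximising $V$ pointwise in $x$ gives the optimal discriminator

```latex
D^{\ast}(x) \;=\; \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_{\text{model}}(x)}
```

so the discriminator estimates the ratio between the data and model densities, and at the equilibrium $p_{\text{model}} = p_{\text{data}}$ it outputs $1/2$ everywhere.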
Non-saturating game
$$J^{(D)} = -\tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\text{data}}} \log D(x;\theta_D) - \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z} \log\bigl(1 - D(G(z;\theta_G))\bigr)$$
$$J^{(G)} = \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z} \log\bigl(1 - D(G(z;\theta_G))\bigr)$$
• Problem: when $D$ successfully rejects generator samples, the generator's gradient vanishes
• Solution:
$$J^{(G)} = -\tfrac{1}{2}\,\mathbb{E}_{z \sim p_z} \log D(G(z;\theta_G))$$
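A minimal numeric sketch of this effect (illustrative values, not from the lecture; PyTorch assumed):

```python
import torch

# When D confidently rejects generated samples (D(G(z)) near 0),
# the minimax generator loss barely moves, the non-saturating one does.
d_fake = torch.tensor([0.01, 0.02, 0.05])            # D(G(z)) for three samples
logits = torch.logit(d_fake).clone().requires_grad_(True)

# Saturating (minimax) loss: J_G = 1/2 E[log(1 - D(G(z)))]
loss_sat = 0.5 * torch.log(1 - torch.sigmoid(logits)).mean()
loss_sat.backward()
print(logits.grad)       # near-zero gradients: learning stalls

logits.grad = None
# Non-saturating loss: J_G = -1/2 E[log D(G(z))]
loss_ns = -0.5 * torch.log(torch.sigmoid(logits)).mean()
loss_ns.backward()
print(logits.grad)       # large gradients exactly where D rejects samples
```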
Training GANs
[Figure: the generator-discriminator pipeline from before, annotated with gradient flow: $\frac{\partial J^{(D)}}{\partial \theta_D} = \frac{\partial J^{(D)}}{\partial D}\frac{\partial D}{\partial \theta_D}$ updates the discriminator, while $\frac{\partial J^{(G)}}{\partial \theta_G} = \frac{\partial J^{(G)}}{\partial D}\frac{\partial D}{\partial G}\frac{\partial G}{\partial \theta_G}$ backpropagates through the discriminator into the generator.]
$$J^{(D)} = -\tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\text{data}}} \log D(x;\theta_D) - \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z} \log\bigl(1 - D(G(z;\theta_G))\bigr)$$
$$J^{(G)} = -\tfrac{1}{2}\,\mathbb{E}_{z \sim p_z} \log D(G(z;\theta_G))$$
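Putting the two objectives together, a minimal training-loop sketch in PyTorch (toy data, network sizes, and learning rates are my own assumptions, not the lecture's):

```python
import torch
import torch.nn as nn

# Minimal GAN training loop on toy 2-D data.
latent_dim, data_dim, batch = 16, 2, 64
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))  # logit output

opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()
ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

for step in range(2000):
    x_real = torch.randn(batch, data_dim) + 3.0   # x ~ p_data (a shifted Gaussian)
    x_fake = G(torch.randn(batch, latent_dim))    # G(z), z ~ p_z

    # Discriminator step: push D(x) towards 1 and D(G(z)) towards 0.
    # detach() stops this step's gradients from reaching the generator.
    loss_D = bce(D(x_real), ones) + bce(D(x_fake.detach()), zeros)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator step, non-saturating loss: -log D(G(z)), i.e. push D(G(z)) to 1.
    loss_G = bce(D(x_fake), ones)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```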
Review: Transposed convolution (deconv)
[Figures: convolution vs. transposed convolution, with stride 1 and with stride 2]
Dumoulin and Visin (2016), A guide to convolution arithmetic for deep learning, arXiv
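A quick shape check (an illustrative PyTorch snippet, not from the slides): a stride-2 transposed convolution doubles the spatial resolution, which is exactly the upsampling step used in DCGAN generators.

```python
import torch
import torch.nn as nn

# A stride-2 transposed convolution maps a 4x4 feature map to 8x8.
x = torch.randn(1, 64, 4, 4)
deconv = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
print(deconv(x).shape)    # torch.Size([1, 32, 8, 8])
```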
DCGAN (generator) architecture
• Deconv weights are learned (unlike Zeiler & Fergus, see lecture 12)
• Deconvs are batch normalised
• No fully connected layer
• Adam optimiser
Radford et al. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks.
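A sketch of a DCGAN-style generator in PyTorch, in the spirit of Radford et al. (2015); the channel counts below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

# All-deconv, batch-normalised generator with no fully connected layer.
G = nn.Sequential(
    nn.ConvTranspose2d(100, 512, 4, 1, 0), nn.BatchNorm2d(512), nn.ReLU(),  # 1x1 -> 4x4
    nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(),  # -> 8x8
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),  # -> 16x16
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),    # -> 32x32
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),                          # -> 64x64 RGB
)

z = torch.randn(1, 100, 1, 1)   # noise enters as a 100-channel 1x1 "image"
print(G(z).shape)               # torch.Size([1, 3, 64, 64])
```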
DCGAN for LSUN bedrooms
High-quality images on a restricted domain
Radford et al. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks.
Vector arithmetic on face samples
Semantically meaningful latent space
Radford et al. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks.
Problems with counting and global structure
Salimans et al. (2016). Improved Techniques for Training GANs. NeurIPS.
Mode collapse
• Another failure mode is for the generator to collapse to a single mode
• If the generator learns to render only one realistic object, this can be enough to fool the discriminator
Salimans et al. (2016). Improved Techniques for Training GANs. NeurIPS.
Measuring GAN performance
Ask humans to distinguish between generated data and real data
• Subjective, requires training to spot flaws
Automating the evaluation (Inception score):
1. Image quality: images should contain meaningful objects (few dominant objects)
   - Feed images to an Inception Net to obtain the conditional label distribution $p(y|x)$
   - For each generated image, the entropy over classes, $-\sum_y p(y|x) \log p(y|x)$, should be low
2. Diversity: the model should generate varied images (balanced over classes)
   - The entropy of the marginal $p(y) = \int_z p\bigl(y \mid x = G(z)\bigr)\,dz$ over generated images should be high
• Combining the two requirements ("how different is the score distribution for a generated image from the overall class balance?"):
$$\mathbb{E}_x\, \mathrm{KL}\bigl(p(y|x)\,\|\,p(y)\bigr)$$
(the Inception score is the exponential of this quantity)
Salimans et al. (2016). Improved Techniques for Training GANs. NeurIPS.
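A minimal NumPy sketch of the score; the class probabilities below are random stand-ins for the $p(y|x)$ rows a real Inception Net would produce.

```python
import numpy as np

# 500 "generated images", 10 classes; each row of probs plays the role of p(y|x).
rng = np.random.default_rng(0)
probs = rng.dirichlet(alpha=np.ones(10), size=500)

p_y = probs.mean(axis=0)                                  # marginal p(y)
kl = (probs * (np.log(probs) - np.log(p_y))).sum(axis=1)  # KL(p(y|x) || p(y))
print(np.exp(kl.mean()))                                  # Inception score
```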
Class-conditional GANs
• Add class information to the latent code $z$
• It is not unsupervised anymore
Mirza and Osindero (2014). Conditional Generative Adversarial Nets. arXiv.
[Figure: the generator $G(\cdot\,;\theta_G)$ receives the noise vector together with a one-hot class label, e.g. (0, 1, 0).]
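A minimal sketch of the conditioning mechanism (sizes are illustrative; PyTorch assumed):

```python
import torch
import torch.nn.functional as F

# The class enters G as a one-hot vector concatenated to the noise
# (Mirza & Osindero, 2014).
batch, latent_dim, n_classes = 4, 16, 3
z = torch.randn(batch, latent_dim)
labels = torch.tensor([0, 1, 2, 1])
one_hot = F.one_hot(labels, num_classes=n_classes).float()
z_cond = torch.cat([z, one_hot], dim=1)   # shape (4, 19): generator input
print(z_cond.shape)
```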
Spectral normalization
• GAN training can be unstable
• Culprit: gradient updates for $G$ are unbounded
• Observation: the first-layer weights of $G$ are ill-behaved when the training is unstable
• Solution: apply spectral normalisation (Miyato et al. normalise the discriminator's weight matrices)
• The spectral norm $\sigma(W)$ is the square root of the maximum eigenvalue of $W^\top W$:
  $\sigma(W) = \max_{h \neq 0} \dfrac{\|Wh\|_2}{\|h\|_2}$
• Each weight matrix is rescaled as $W \leftarrow W / \sigma(W)$
Miyato et al. (2018). Spectral Normalization for GANs. ICLR.
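In practice $\sigma(W)$ is estimated cheaply by power iteration; Miyato et al. (2018) run roughly one iteration per training step. A minimal NumPy sketch (the matrix here is random, for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))

u = rng.standard_normal(64)
for _ in range(50):                       # iterate v <- W^T u, u <- W v
    v = W.T @ u; v /= np.linalg.norm(v)
    u = W @ v;  u /= np.linalg.norm(u)
sigma = u @ W @ v                         # largest singular value of W

print(sigma, np.linalg.svd(W, compute_uv=False)[0])   # the two should agree
W_sn = W / sigma                          # spectrally normalised weights
```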
Scaling up GANs
Brock et al. show that
• bigger batch sizes
• more filter channels (~50% more)
• a larger dataset (292 million images with 8.5k classes, 512 TPU cores!)
improve the state of the art by around 35% in Inception Score
Brock et al. (2019). Large Scale GAN Training for High Fidelity Natural Image Synthesis. ICLR.
2014 to 2019
[Figure: samples from Goodfellow et al. (2014) alongside samples from Brock et al. (2019)]
Summary
• Generative models
• Training GANs
• DCGAN architecture
• (Open) challenges
• Improvements
Reading material
Recommended
• Goodfellow et al. (2014). Generative adversarial nets. NeurIPS.
• Longer version: Goodfellow (2016). NIPS 2016 Tutorial: Generative Adversarial Networks.
Extra
• Radford et al. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks.
• Salimans et al. (2016). Improved Techniques for Training GANs. NeurIPS.
• Miyato et al. (2018). Spectral Normalization for GANs. ICLR.
• Brock et al. (2019). Large Scale GAN Training for High Fidelity Natural Image Synthesis. ICLR.