
Generative Models

M. Soleymani

Sharif University of Technology

Fall 2017

Slides are based on Fei-Fei Li and colleagues' lectures, CS231n, Stanford, 2017.


Supervised Learning

• Data: (x, y)
– x is data

– y is label

• Goal: Learn a function to map x -> y

• Examples: Classification, regression, object detection, semantic segmentation, image captioning, etc.


Unsupervised Learning

• Data: x
– Just data, no labels!

• Goal: Learn some underlying hidden structure of the data

• Examples: Clustering, dimensionality reduction, feature learning, density estimation, etc.


Autoencoders

• Unsupervised approach for learning a lower-dimensional feature representation from unlabeled training data

– Train such that features can be used to reconstruct original data
• “Autoencoding” - encoding itself
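A minimal sketch of this idea in PyTorch (layer sizes and the MSE reconstruction loss are illustrative choices, not taken from the slides):

```python
# Minimal autoencoder sketch: encode to a lower-dimensional z, decode back,
# and train to reconstruct the (unlabeled) input.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, feature_dim=32):
        super().__init__()
        # Encoder maps the input to a lower-dimensional feature z
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, feature_dim))
        # Decoder tries to reconstruct the original input from z
        self.decoder = nn.Sequential(
            nn.Linear(feature_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                     # a stand-in batch of unlabeled data
x_hat = model(x)
loss = nn.functional.mse_loss(x_hat, x)     # L2 reconstruction loss
loss.backward()
optimizer.step()
```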


• Pre-train the encoder with (unlabeled) data
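A sketch of what that pre-training buys you, continuing the autoencoder code above (the 10-class head and placeholder data are hypothetical, only for illustration):

```python
# Reuse the pre-trained encoder as a feature extractor, attach a classifier
# head, and fine-tune on a (possibly small) labeled set.
import torch
import torch.nn as nn

classifier = nn.Sequential(model.encoder, nn.Linear(32, 10))   # model from the sketch above
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-4)

x_labeled = torch.rand(16, 784)
y_labels = torch.randint(0, 10, (16,))
loss = nn.functional.cross_entropy(classifier(x_labeled), y_labels)
loss.backward()
optimizer.step()
```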


Generative Models


• Task: Given a dataset of images {X1, X2, ...} can we learn the distribution of X?

• Typically, generative models imply modelling P(X).
– Very limited: given an input, the model outputs a probability

• We are more interested in models that let us sample from the distribution of the data.

– producing examples that are similar to the training examples but not exactly the same
• synthesize new, unseen examples similar to the seen ones


Taxonomy of Generative Models


Why Generative Models?

• Realistic samples for artwork, super-resolution, colorization, etc.

• Generative models of time-series data can be used for simulation and planning (reinforcement learning applications!)

• Training generative models can also enable inference of latent representations that can be useful as general features


Fully visible belief network


Chain rule:

Oord et al., Pixel recurrent neural networks, ICML 2016
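The chain-rule factorization referred to here (the equation itself did not survive extraction) is the standard autoregressive decomposition of the image likelihood over pixels:

$$p(x) = \prod_{i=1}^{n} p\big(x_i \mid x_1, \ldots, x_{i-1}\big)$$

Each factor is the distribution of the i-th pixel value given all previous pixels in a fixed (raster-scan) ordering; training maximizes the likelihood of the training images.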


PixelRNN

• Generate image pixels starting from corner

• Dependency on previous pixels modeled using an RNN (LSTM)


• Drawback: sequential generation is slow!

Oord et al., Pixel recurrent neural networks, ICML 2016
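A rough sketch of why sequential generation is slow (this is a generic pixel-by-pixel sampling loop with an LSTM cell, not the Row/Diagonal LSTM architecture of the paper; sizes are illustrative and the network is untrained):

```python
# Each pixel is sampled from a 256-way softmax conditioned on all previously
# generated pixels via the RNN state; every pixel must wait for the last one.
import torch
import torch.nn as nn

H = W = 8                                   # tiny illustrative image
lstm = nn.LSTMCell(input_size=1, hidden_size=64)
to_logits = nn.Linear(64, 256)              # softmax over 256 pixel values

h = torch.zeros(1, 64); c = torch.zeros(1, 64)
prev = torch.zeros(1, 1)                    # "start" input for the corner pixel
image = torch.zeros(H, W)
for i in range(H):
    for j in range(W):                      # inherently sequential -> slow
        h, c = lstm(prev, (h, c))
        probs = torch.softmax(to_logits(h), dim=-1)
        value = torch.multinomial(probs, 1).float()   # sample a value in [0, 255]
        image[i, j] = value.item()
        prev = value / 255.0
```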


PixelCNN

• Still generate image pixels starting from corner

• Dependency on previous pixels now modeled using a CNN over context region


• Training: maximize likelihood of training images

Softmax loss at each pixel


• Training is faster than PixelRNN
– can parallelize convolutions since context region values are known from training images
• Generation must still proceed sequentially => still slow
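A sketch of the masked-convolution idea that makes parallel training possible (single-channel simplification; the paper additionally handles RGB channel ordering with its 'A'/'B' mask variants):

```python
# The convolution kernel only sees pixels above and to the left of the current
# position, so all positions can be trained in parallel with a per-pixel softmax.
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    def __init__(self, *args, mask_type="A", **kwargs):
        super().__init__(*args, **kwargs)
        kH, kW = self.kernel_size
        mask = torch.ones(kH, kW)
        mask[kH // 2, kW // 2 + (mask_type == "B"):] = 0   # center row: block center/right
        mask[kH // 2 + 1:, :] = 0                          # block everything below
        self.register_buffer("mask", mask)

    def forward(self, x):
        self.weight.data *= self.mask      # zero out "future" context before convolving
        return super().forward(x)

conv = MaskedConv2d(1, 64, kernel_size=5, padding=2)
x = torch.rand(8, 1, 28, 28)               # batch of training images
features = conv(x)                          # all pixel positions processed in parallel
```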


Generated Samples


PixelRNN and PixelCNN


So far


Variational Autoencoders

• Probabilistic spin on autoencoders
– will let us sample from the model to generate data!


Learn model parameters to maximize likelihood of training data


Variational Autoencoders: Intractability


Cannot optimize the likelihood directly; derive and optimize a lower bound on the likelihood instead


Lower bound for log likelihood


$$
\begin{aligned}
\log p(x) &= E_{z \sim q(z|x)}\big[\log p(x)\big] \\
&= E_{z \sim q(z|x)}\left[\log \frac{p(x|z)\, p(z)}{p(z|x)}\right] && \text{(Bayes rule)} \\
&= E_{z \sim q(z|x)}\left[\log \frac{p(x|z)\, p(z)}{p(z|x)} \cdot \frac{q(z|x)}{q(z|x)}\right] \\
&= E_{z \sim q(z|x)}\big[\log p(x|z)\big] + E_{z \sim q(z|x)}\left[\log \frac{q(z|x)}{p(z|x)}\right] - E_{z \sim q(z|x)}\left[\log \frac{q(z|x)}{p(z)}\right] \\
&= \underbrace{E_{z \sim q(z|x)}\big[\log p(x|z)\big] - D_{KL}\big(q(z|x)\,\|\,p(z)\big)}_{\text{ELBO}} + \underbrace{D_{KL}\big(q(z|x)\,\|\,p(z|x)\big)}_{\ge\, 0}
\end{aligned}
$$

Since the last KL term is nonnegative, the ELBO is a lower bound on $\log p(x)$.


Maximizing ELBO

• The ELBO being maximized is composed of:
– Approximate posterior distribution q(z|x): best match to the true posterior p(z|x)

– Reconstruction cost: the expected log-likelihood measures how well samples from q(z|x) are able to explain the data x.

– Penalty: ensures that the explanation of the data q(z|x) doesn’t deviate too far from your beliefs p(z).


Encoder network

• Model q(z|x) with a neural network (called encoder network)

• Assume q(z|x) to be Gaussian

• Neural network outputs the mean $\mu_{z|x}$ and covariance matrix $\Sigma_{z|x}$
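A minimal sketch of such an encoder (diagonal covariance parameterized by a log-variance head; sizes are illustrative):

```python
# Encoder q(z|x): outputs the mean and (diagonal) log-variance of a Gaussian over z.
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=20):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(input_dim, 400), nn.ReLU())
        self.mu = nn.Linear(400, latent_dim)         # mean of q(z|x)
        self.logvar = nn.Linear(400, latent_dim)     # log of the diagonal covariance

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.logvar(h)
```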


Decoder network

• Model p(x|z) with a neural network (called decoder network)
– let f(z) be the network output.

– Assume p(x|z) to be i.i.d. Gaussian, e.g.,
• x = f(z) + η, where η ~ N(0, I)
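A matching decoder sketch (the network only computes the Gaussian mean f(z); sizes are illustrative):

```python
# Decoder p(x|z) = N(f(z), sigma^2 I): the network outputs f(z).
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self, latent_dim=20, output_dim=784):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(latent_dim, 400), nn.ReLU(),
                               nn.Linear(400, output_dim))

    def forward(self, z):
        return self.f(z)   # f(z); the model is x = f(z) + eta with eta ~ N(0, sigma^2 I)
```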


Cost function

If we consider $p(x|z) \sim N\big(f(z), \sigma^2 I\big)$, we have:

$$\log p(x|z) = C - \frac{1}{2\sigma^2}\,\|x - f(z)\|^2$$

• Putting it all together (given an (x, z) pair):

$$L = \frac{1}{2\sigma^2}\,\|x - f(z)\|^2 + D_{KL}\big(q(z|x)\,\|\,p(z)\big)$$

The first term corresponds to the $E_{z \sim q(z|x)}\big[\log p(x|z)\big]$ term, since $z \sim q(z|x)$ on the training data.
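A sketch of this loss in code, assuming the diagonal Gaussian encoder above and a unit Gaussian prior p(z) = N(0, I), for which the KL term has the standard closed form:

```python
# Per-example VAE loss: scaled L2 reconstruction + KL(q(z|x) || p(z)).
import torch

def vae_loss(x, x_hat, mu, logvar, sigma_x=1.0):
    # Reconstruction term: (1 / 2 sigma^2) * ||x - f(z)||^2, summed over dimensions
    recon = ((x - x_hat) ** 2).sum(dim=1) / (2 * sigma_x ** 2)
    # KL(N(mu, diag(exp(logvar))) || N(0, I)) in closed form
    kl = 0.5 * (mu ** 2 + logvar.exp() - 1.0 - logvar).sum(dim=1)
    return (recon + kl).mean()
```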


Reparametrization Trick

z ~ N(μ, σ) is equivalent to z = μ + σ ⋅ ε, where ε ~ N(0, 1)

Sampling is a non-continuous operation with no gradient; with this trick the randomness moves into ε, so we can easily backpropagate the loss to the encoder.
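In code, the trick is a one-liner (using the log-variance parameterization from the encoder sketch):

```python
# Reparameterization: z = mu + sigma * eps, eps ~ N(0, I); gradients flow
# through mu and sigma back to the encoder, not through the sampling of eps.
import torch

def sample_z(mu, logvar):
    std = (0.5 * logvar).exp()      # sigma
    eps = torch.randn_like(std)     # eps ~ N(0, I)
    return mu + std * eps
```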


Generate Image: At Test Time

• Remove the Encoder

• Sample z ~ N(0,I) and pass it through the Decoder.

• No good quantitative metric, relies on visual inspection
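A small sketch of test-time generation (the stand-in decoder here only illustrates the shapes; in practice it would be the trained decoder from above):

```python
# Drop the encoder, sample z from the prior N(0, I), and decode.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(20, 400), nn.ReLU(), nn.Linear(400, 784))  # stand-in for a trained decoder

with torch.no_grad():
    z = torch.randn(16, 20)      # 16 samples from the prior, latent dim 20
    samples = decoder(z)         # decoder output f(z) is shown as the generated image
```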


Variational Autoencoders: Generating Data!


Common VAE Architecture for Image Generation


Conditional VAE (CVAE)

• Replace all p(x|z) with p(y|z,x)

• Replace all q(z|x) with q(z|x,y).


Go through the same KL divergence procedure to get the same lower bound.

Lower bound of the log conditional likelihood:

$$\log p(y|x) - D_{KL}\big(q(z|y,x)\,\|\,p(z|y,x)\big) = E_{z \sim q(z|x,y)}\big[\log p(y|x,z)\big] - D_{KL}\big(q(z|y,x)\,\|\,p(z)\big)$$


Attribute2Image

Xinchen Yan et al., Attribute2Image: Conditional Image Generation from Visual Attributes, 2015.


Variational Autoencoders

• Probabilistic spin to traditional autoencoders => allows generating data

• Defines an intractable density => derive and optimize a (variational) lower bound

• Pros:
– Principled approach to generative models
– Allows inference of q(z|x), can be useful feature representation for other tasks

• Cons:
– Maximizes lower bound of likelihood: okay, but not as good evaluation as PixelRNN/PixelCNN
– Samples blurrier and lower quality compared to state-of-the-art (GANs)

• Active areas of research:
– More flexible approximations, e.g. richer approximate posterior instead of diagonal Gaussian
– Incorporating structure in latent variables


So far


Generative Adversarial Networks

• Problem: Want to sample from complex, high-dimensional training distribution.

– No direct way to do this!

• Solution:
– Sample from a simple distribution, e.g. random noise.

– Then, learn transformation to training distribution.

• Q: What can we use to represent this complex transformation?

– A neural network!

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014.


Training GANs: Two-player game

• Generator network: try to fool the discriminator by generating real-looking images

• Discriminator network: try to distinguish between real and fake images


Discriminator (θ_d) wants to maximize objective such that D(x) is close to 1 (real) and D(G(z)) is close to 0 (fake)

Generator (θ_g) wants to minimize objective such that D(G(z)) is close to 1 (discriminator is fooled into thinking generated G(z) is real)
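Written out, the minimax objective these bullets describe (Goodfellow et al., 2014) is

$$\min_{\theta_g}\,\max_{\theta_d}\;\; E_{x \sim p_{data}}\big[\log D_{\theta_d}(x)\big] \;+\; E_{z \sim p(z)}\big[\log\big(1 - D_{\theta_d}(G_{\theta_g}(z))\big)\big]$$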


Aside: Jointly training two networks is challenging and can be unstable. Choosing objectives with better loss landscapes helps training; this is an active area of research.

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014.


Putting it together: GAN training algorithm

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014.
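A minimal sketch of the alternating training loop (assumptions: G and D are nn.Modules, D ends in a sigmoid so it outputs probabilities, `loader` yields (images, labels) batches, and `z_dim` is the noise dimension; the paper's algorithm takes k discriminator steps per generator step, and in practice the generator often maximizes log D(G(z)) instead of minimizing log(1 - D(G(z)))):

```python
import torch

def train_gan(G, D, loader, z_dim=100, epochs=1, k=1, lr=2e-4):
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    for _ in range(epochs):
        for x_real, _ in loader:
            # Discriminator updates: push D(x) toward 1 and D(G(z)) toward 0
            for _ in range(k):
                z = torch.randn(x_real.size(0), z_dim)
                x_fake = G(z).detach()          # don't backprop into G here
                d_loss = -(torch.log(D(x_real)).mean() +
                           torch.log(1 - D(x_fake)).mean())
                opt_d.zero_grad(); d_loss.backward(); opt_d.step()
            # Generator update: push D(G(z)) toward 1 (fool the discriminator)
            z = torch.randn(x_real.size(0), z_dim)
            g_loss = torch.log(1 - D(G(z))).mean()
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```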



GAN: Example Generation


Generative Adversarial Nets: Convolutional Architectures

• Generator is an upsampling network with fractionally-strided convolutions
• Discriminator is a convolutional network
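A rough sketch of what such a generator looks like (a small DCGAN-style stack of transposed convolutions; layer sizes are illustrative, not taken from the slides):

```python
# Project the noise vector and upsample with fractionally-strided (transposed)
# convolutions up to a 28x28 single-channel image in [-1, 1].
import torch
import torch.nn as nn

generator = nn.Sequential(
    # z: (N, 100, 1, 1) -> (N, 128, 7, 7)
    nn.ConvTranspose2d(100, 128, kernel_size=7, stride=1, padding=0),
    nn.BatchNorm2d(128), nn.ReLU(),
    # (N, 128, 7, 7) -> (N, 64, 14, 14)
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
    nn.BatchNorm2d(64), nn.ReLU(),
    # (N, 64, 14, 14) -> (N, 1, 28, 28)
    nn.ConvTranspose2d(64, 1, kernel_size=4, stride=2, padding=1),
    nn.Tanh())

z = torch.randn(16, 100, 1, 1)
fake_images = generator(z)        # (16, 1, 28, 28)
```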


Generative Adversarial Nets: Interpretable Vector Math


Conditional GAN

M. Mirza, S. Osindero, Conditional Generative Adversarial Nets, 2014.


Energy-Based GAN (EBGAN)

• EBGAN architecture with an auto-encoder discriminator

• auto-encoders have the ability to learn an energy manifold without supervision or negative examples

J. Zhao et al., Energy-based Generative Adversarial Network, 2016.


Combining VAE with GAN

Anders Boesen Lindbo Larsen et al., Autoencoding beyond pixels using a learned similarity metric, ICML 2016.


Generative Adversarial Text to Image Synthesis

S. Reed et al., Generative Adversarial Text to Image Synthesis, 2016.


GANs

• Don’t work with an explicit density function
• Take game-theoretic approach: learn to generate from training distribution through 2-player game

• Pros:
– Beautiful, state-of-the-art samples!

• Cons:
– Trickier / more unstable to train
– Can’t solve inference queries such as p(x), p(z|x)

• Active areas of research:
– Better loss functions, more stable training (Wasserstein GAN, LSGAN, many others)
– Conditional GANs, GANs for all kinds of applications


Recap


Resources

• Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014.

• Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014.

• Ian Goodfellow, “Generative Adversarial Networks”, NIPS 2016 Tutorial.