Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text...

23
Generative Adversarial Networks Miloš Jordanski

Transcript of Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text...

Page 1: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Generative AdversarialNetworks

Miloš Jordanski

Page 2: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Generative models

• Training data 𝐷 = 𝑥1, … , 𝑥𝑁 ~𝑝𝑑𝑎𝑡𝑎(𝑥)

• Result: distribution 𝑝𝑚𝑜𝑑𝑒𝑙(𝑥) ≅ 𝑝𝑑𝑎𝑡𝑎(𝑥)

Page 3: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Generative models

• 𝑝𝑚𝑜𝑑𝑒𝑙(𝑥) directly

• Generate samples from 𝑝𝑚𝑜𝑑𝑒𝑙(𝑥)

Page 4: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Applications

• Additional training data• Missing values• Semi-supervised learning• Reinforcement learning• Multiple correct answers• Text-to-Image Synthesis• Learn useful embeddings• ...

Page 5: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Generative Adversarial Networks

Page 6: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Models with latent space

• x - observed variables• z – latent variables

Page 7: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

DCGAN

Page 8: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN - architecture

Page 9: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN - discriminator

𝐽(𝐷) 𝜃 𝐷 , 𝜃 𝐺 = −12𝐸𝑥~𝑝𝑑𝑎𝑡𝑎𝑙𝑜𝑔𝐷(𝑥) −

12𝐸𝑧log(1 − 𝐷 𝐺(𝑧) )

• Optimal discriminator:

𝐷∗ 𝑥 =𝑝𝑑𝑎𝑡𝑎(𝑥)

𝑝𝑚𝑜𝑑𝑒𝑙(𝑥) + 𝑝𝑑𝑎𝑡𝑎(𝑥)

Page 10: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN – Choice of divergence

• By varying 𝜃 we can make 𝑃𝜃 distribution arbitrarily “close” to 𝑃𝑑𝑎𝑡𝑎• How to define “close”?

Page 11: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN – Choice of divergence

• Kullback-Leibler divergence: 𝐾𝐿(ℙ| ℚ = 𝔼𝑥~ℙ[𝑙𝑜𝑔

ℙ(𝑥)ℚ(𝑥)

]

• Jensen Shannon divergence:

𝐽𝑆(ℙ| ℚ =12𝐾𝐿(ℙ ||

12ℙ + ℚ ) +

12𝐾𝐿(ℚ||

12(ℙ + ℚ))

Page 12: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN – Choice of divergence

𝐾𝐿(ℙ| ℚ = ℙ(𝑥)𝑙𝑜𝑔 ℙ(𝑥)ℚ(𝑥)

𝐾𝐿(ℚ| ℙ = ℚ(𝑥)𝑙𝑜𝑔 ℚ(𝑥)ℙ(𝑥)

Page 13: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Zero-sum game

• N players• Payoff matrix

• Nash equilibrium of a game

Page 14: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Generator cost function

𝐽(𝐺) 𝜃 𝐷 , 𝜃 𝐺 = −𝐽(𝐷) 𝜃 𝐷 , 𝜃 𝐺

• Minimax game:𝑉(𝜃 𝐷 , 𝜃 𝐺 ) = −𝐽(𝐷) 𝜃 𝐷 , 𝜃 𝐺

• Solution:𝜃 𝐷 ∗ = 𝑎𝑟𝑔𝑚𝑖𝑛max

𝜃 𝐺 𝜃 𝐷𝑉(𝜃 𝐷 , 𝜃 𝐺 )

In practice:

𝐽(𝐺) 𝜃 𝐷 , 𝜃 𝐺 = − 12𝐸𝑧log(𝐷 𝐺(𝑧) )

Page 15: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Generator Cost function

• Minimax: 𝐽(𝐺)= 12𝐸𝑥~𝑝𝑑𝑎𝑡𝑎𝑙𝑜𝑔𝐷 𝑥 + 1

2𝐸𝑧log(1 − 𝐷 𝐺(𝑧) )

• Heuristic: 𝐽(𝐺) = − 12𝐸𝑧log(𝐷 𝐺(𝑧) )

• MLE: 𝐽(𝐺) = − 12𝐸𝑧𝑒(𝜎

−1(𝐷(𝐺(𝑧))))

Page 16: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN - Training

• Simultaneous stochastic gradient descent

• Discriminator updates 𝜃(𝐷) to reduce 𝐽(𝐷)

• Generator updates 𝜃(𝐺) to reduce 𝐽(𝐺)

• 𝐽(𝐺) does not make reference to the training data directly

Page 17: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN - Tricks

• 𝑥 ∈ [−1, 1]

• Using labels

• One-side label smoothing

• Batch normalization

• Running more steps of one player

Page 18: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Generative Adversarial Networks

• Advantages:

• Sample generation in a single step (does not depend on dimensionality of X)

• Does not use Markov chain

• Does not use variational inference

• Generation function has very few restrictions

• Disadvantages:

• Hard to train

Page 19: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN - results

Page 20: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

Semi-supervised learning with GAN

Page 21: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN – Latent Space Arithmetic

Page 22: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

GAN – another perspective

• Cooperative instead of adversarial

• Learn cost function

• Reinforcement Learning

• Actor-Critic

• …

Page 23: Generative Adversarial Networks - Machine Learningmachinelearning.math.rs/Jordanski-GAN.pdf•Text -to-Image Synthesis •Learn useful embeddings •... Generative AdversarialNetworks

THANKS!