New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to...
Transcript of New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to...
![Page 1: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/1.jpg)
New (Optimization) Perspectives on GANs
Gauthier Gidel
![Page 2: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/2.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
I. A Variational Inequality Perspective on GANs.
II. Reducing Noise in GANs with Variance Reduced Methods.
![Page 3: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/3.jpg)
A Variational Inequality Perspective on GANs
Gauthier Gidel*¹, Hugo Berard*¹², Gaëtan Vignoud¹, Pascal Vincent¹², Simon Lacoste-Julien¹
*equal contribution¹ Mila, Université de Montréal
² Facebook AI Research (FAIR), Montréal
![Page 4: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/4.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Hugo Berard
GaëtanVignoud
PascalVincent
SimonLacoste-Julien
![Page 5: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/5.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
1. Quick Recap on GANs and two-player games.
2. GAN as a Variational Inequality Problem.
3. Optimization of Variational Inequality.
4. Experimental results.
5. Conclusion.
NB: All the citations in this talk are in my arXiv submission.
![Page 6: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/6.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Quick recap on Generative Adversarial Networks (GANs)
(and two-player games)
![Page 7: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/7.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Generative Adversarial Networks (GANs)Fake Data
True Data
GeneratorNoise
DiscriminatorFakeorReal
[Goodfelow et al. NIPS 2014]
![Page 8: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/8.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Generative Adversarial Networks (GANs)
Discriminator Generator
If D is non-parametric:
[Goodfelow et al. NIPS 2014]
Non-saturating GAN: “much stronger gradient in early learning”Loss of Generator Loss of Discriminator
![Page 9: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/9.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Two-player Games
Zero-sum game if: also called Saddle Point (SP).
Example: WGAN formulation [Arjovsky et al. 2017]
Player 2Player 1
![Page 10: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/10.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Two-player GamesPlayer 2Player 1
● In games we want to converge to the Saddle Point.
● Different from single objective minimization where
we want to avoid saddle points.
● Saddle point -> Zero-sum game (or Minmax)
![Page 11: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/11.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Two-player Games
Non zero-sum game if we do not have:
Player 2Player 1
Example: Non-saturating GAN: [Goodfellow et al. 2014]
Loss of Generator Loss of Discriminator
![Page 12: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/12.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Minmax training is hard different !
![Page 13: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/13.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Minmax training is hard different !
(You can replace “minmax” with two-player games)
![Page 14: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/14.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
“Minmax Training is Hard ...”Example: WGAN with linear discriminator and generator
Bilinear saddle point = Linear in 𝜃 and 𝜙 ⇒ “Cycling behavior” (see right).
Gradient vector field:
![Page 15: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/15.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Generative Adversarial Networks as a Variational Inequality Problem
(VIP)
![Page 16: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/16.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
GANs as a Variational Inequality
Nash-Equilibrium:
Stationary Conditions:
No player can improve its cost
New perspective for GANs:- Based on stationary conditions.- Relates to vast literature with standard algorithms.
can be constraint sets.
![Page 17: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/17.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
GANs as a Variational Inequality
Nash-Equilibrium: Stationary Conditions:
Same problem but different perspective.
Joint Minimization vs. Stationary point
![Page 18: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/18.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
GANs as a Variational InequalityStationary Conditions:
Can be written as:
𝜔* solves the Variational Inequality
![Page 19: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/19.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
GANs as a Variational InequalityStationary Conditions:
Figure from [Dunn 1979]
Unconstrained (or optimum in the interior):
![Page 20: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/20.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
GANs as a Variational InequalityStationary Conditions:
Unconstrained (or ⍵* in the interior): Constrained and ⍵* on the boundary:
Figure from [Dunn 1979]
![Page 21: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/21.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
GANs as a Variational InequalityTakeaways:
- GAN can be formulated as a Variational Inequality.
- Encompass most of GANs formulations.
- Standard algorithms from Variational Inequality can be used for GANs.
- Theoretical Guarantees (for convex and stochastic cost functions).
![Page 22: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/22.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Techniques to optimize VIP (Batch setting)
![Page 23: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/23.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.
- Easy to implement. (out of the training loop)- Can be combined with any method.
Averaging schemes can be efficiently implemented in an online fashion:
![Page 24: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/24.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.
- Easy to implement. (out of the training loop)- Can be combined with any method.
General Online averaging:
Example 1: Uniform averaging
Example 2: Exponential moving averaging (EMA)
![Page 25: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/25.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.
- Easy to implement. (out of the training loop)- Can be combined with any method.
General Online averaging:
Example 1: Uniform averaging
Example 2: Exponential moving averaging (EMA)
![Page 26: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/26.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.
- Easy to implement. (out of the training loop)- Can be combined with any method.
General Online averaging:
Example 1: Uniform averaging
Example 2: Exponential moving averaging (EMA)
![Page 27: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/27.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.
- Easy to implement. (out of the training loop)- Can be combined with any method.
General Online averaging:
Example 1: Uniform averaging
Example 2: Exponential moving averaging (EMA)
![Page 28: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/28.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging
Simple Minmax problem:
![Page 29: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/29.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging
Simple Minmax problem:
![Page 30: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/30.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 1: Averaging Simultaneous Vs. Alternating more developed in
Negative Momentum for Improved Game DynamicsGidel, Askari Hemmat, Pezeshki, Lepriol, Huang, Lacoste-Julien and Mitliagkas
![Page 31: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/31.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 2: Extragradient
- Step 1:
- Step 2:
Intuition:
1. Game prespective: Look one step in the future and anticipate next move of adversary.
2. Euler’s method: Extrapolation is close to an implicit method because
- Standard in the literature.- Does not require averaging.- Theoretically and empirically
faster.
![Page 32: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/32.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 2: Extragradient
Intuition: Extrapolation is close to an implicit method because
Unknown:Require to solve a non-linear system
![Page 33: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/33.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard Algorithms from Variational InequalityMethod 2: Extragradient Intuition: Extrapolation is close to an implicit method
*
*
almost the same
![Page 34: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/34.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Problem: Extragradient requires to compute two gradients at each step.
Solution: Extrapolation from the past Re-use gradient.
- Step 1: Re-use from previous iteration.
- Step 2: (same as extragradient).
Extrapolation from the past: Re-using the gradients
![Page 35: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/35.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Extrapolation from the past: Re-using the gradients
Problem: Extragradient requires to compute two gradients at each step.
Solution: Extrapolation from the past Re-use gradient.
- Step 1: Re-use from previous iteration.
- Step 2: (same as extragradient).
New Method !!!Related to [Daskalakis et al., 2018]
![Page 36: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/36.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
step-size = 0.2step-size = 0.5
![Page 37: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/37.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Experimental Results
![Page 38: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/38.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Experimental Results
Bilinear Stochastic Objective: (with constraints)
![Page 39: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/39.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Extrapolation(Adam style)
Update(Adam style)
![Page 40: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/40.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Experimental Results: WGAN on CIFAR10Inception Score on CIFAR10
Extragradient Methods
Inception Score vs nb of generator updates
Averaging
![Page 41: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/41.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Experimental Results: WGAN-GP (ResNet) on CIFAR10
Extragradient Methods Averaging
Inception Score vs Number of
![Page 42: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/42.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
To sum-up
- GAN can be formulated as a Variational Inequality.
- Bring standard methods from optimization literature to the GAN community.
- Averaging helps improve the inception score (further evidence by [Yazici et al. 2018]).
- Extrapolation is faster and achieve better convergence.
- Introduce Extrapolation from the past a cheaper version of extragradient.
- We can design better algorithm for GANs inspired from Variational Inequality.
![Page 43: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/43.jpg)
Noise in GANs
![Page 44: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/44.jpg)
Reducing Noise in GAN Training with Variance Reduced Extragradient
Tatjana Chavdarova*¹², Gauthier Gidel*¹, François Fleuret¹², Simon Lacoste-Julien¹
*equal contribution¹ Mila, Université de Montréal
² EPFL, IDIAP
![Page 45: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/45.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Tatjana Chavdarova
François Fleuret
SimonLacoste-Julien
![Page 46: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/46.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Reminder: Need for Averaging or/and Extragradient.
![Page 47: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/47.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Reminder: Need for Averaging or/and Extragradient.
No signal from the average iterate.
The green sequence do not stop at the optimum.
We need last iterate convergence.(Not Convergence of the averaged iterate)
Focus on Extragradient.
![Page 48: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/48.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Issue: We did not consider noise.
Minimization Game
![Page 49: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/49.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Issue: We did not consider noise.
Far from the objective: “approximately” the right direction
Far from the objective:Direction with noise can be “bad”.
![Page 50: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/50.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Standard methods to solve (bilinear) games:
Gradient method Extragradient
Batch Method Diverge to ∞
Stochastic Method No hope for convergence ????
![Page 51: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/51.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Noise breaks Extragradient.
![Page 52: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/52.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Noise breaks Extragradient.Intuition:
Extragradient Updates:
![Page 53: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/53.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Noise breaks Extragradient.Intuition:
Extragradient Updates:(Sample i and j)
Extrapolation part
Ai Aj = 0 No extrapolation
Diverge as GD.
![Page 54: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/54.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Reducing noise with Variance reduction methods.
- Idea: take advantage of the finite sum.
- Finite sum in ML: Expectation of a finite number of sample.
- Generator and discriminator losses can be written as:
![Page 55: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/55.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
SVRG estimate of the gradient.
- Full batch gradient expensive but tractable.
![Page 56: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/56.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
SVRG estimate of the gradient.
- Full batch gradient expensive but tractable.
Snapshot network
![Page 57: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/57.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
SVRG estimate of the gradient.
- Full batch gradient expensive but tractable.
Snapshot network
Full gradient at thesnapshot network
![Page 58: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/58.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
SVRG estimate of the gradient.
- Full batch gradient expensive but tractable.
- Unbiased estimates:
Snapshot network
Full gradient at thesnapshot network
![Page 59: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/59.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
SVRG estimate of the gradient.
- Full batch gradient expensive but tractable.
- Unbiased estimates:
- Compute the snapshot only once per pass.
Snapshot network
Full gradient at thesnapshot network
![Page 60: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/60.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Variance Reduced Extragradient: SVRE
- Combine Extragradient + Variance Reduction for finite sum.
![Page 61: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/61.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Variance Reduction of Strongly Monotone Games:
SVRG and Acc. SVRG are from [Palaniapan and Bach 2016]
![Page 62: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/62.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Why is this convergence rate not desirable ?
Vs.
Does not handle Unconstrained case.No restart possible.
Does handle Unconstrained case.Restart possible.
![Page 63: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/63.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
SVRE on bilinear Game: (Exact example where stochastic extragradient breaks)
![Page 64: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/64.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
First point, SVRE effectively reduces the variance:
Blue: Stochastic Extragradient
Brown: SVRE.
![Page 65: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/65.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Second point SVRE allows larger step-sizes: (SVHN)
SE: Stochastic Extragradient.
SVRE: Variance Reduced Extragradient.
-A: Adam
WS: Warm Start.
AVG: Average.
-VRAd (VRam): variant of Adam for SVRE.
![Page 66: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/66.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
Second point SVRE allows larger step-sizes: (ImageNet)
![Page 67: New (Optimization) Perspectives on GANs · - Bring standard methods from optimization literature to the GAN community. - Averaging helps improve the inception score (further evidence](https://reader033.fdocuments.in/reader033/viewer/2022043019/5f3b2ad2ef1724745f4ad254/html5/thumbnails/67.jpg)
Gauthier Gidel, MSR Seminar, January 29, 2019
To sum-up
- Noise may be an issue in GANs.
- Proposed to combine VR + Extragradient to tackle both game and noise aspects.
- Unlike in single-objective minimization, we observed that variance reduction could improve the performance of deep learning models for GAN training.
- highlights the difference between game optimization and standard minimization.