Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization...

70
Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua Bengio Membre : Ioannis Mitliagkas Directeur : Simon Lacoste-Julien

Transcript of Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization...

Page 1: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Efficient Saddle Point Optimization for Modern Machine Learning

Prédoc III - Gauthier Gidel

Jury: Président : Yoshua Bengio

Membre : Ioannis MitliagkasDirecteur : Simon Lacoste-Julien

Page 2: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Outline

1. Introduction on Saddle point optimization, Games and Variational Inequalities.

2. Frank-Wolfe Algorithm for Saddle Point problems.

3. Negative Momentum for improved game dynamics.

4. A Variational inequality perspective on GANs.

5. Future Work.

NB: All the citations in this talk are at the end of the slides.

Slides available on my website: http://gauthiergidel.github.io

Page 3: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Saddle point optimization, Games and Variational Inequalities.

Based on [Gidel et al. 2017], [Gidel et al. 2018a] and [Gidel et al. 2018b]

Page 4: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Game dynamics are weirdfascinating

Page 5: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Start with optimization dynamics

Page 6: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Optimization

Smooth, differentiable cost function, L→ Looking for stationary (fixed) points

(gradient is 0)→ Gradient descent

Gauthier Gidel, Predoc III , November 28, 2018

Page 7: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Optimization

Conservative vector field →

Gradient based dynamics

Gauthier Gidel, Predoc III , November 28, 2018

Page 8: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Saddle point problems

Gauthier Gidel, Predoc III , November 28, 2018

Smooth, differentiable cost function,→ Looking for stationary (fixed) points

(gradients are 0)→ Gradient descent method.

Page 9: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Non-Conservative vector field →

Gradient based dynamics:

Gauthier Gidel, Predoc III , November 28, 2018

Saddle point problems

Page 10: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Minmax training is hard different !

Page 11: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Minmax training is hard different !(You can replace “minmax” with two-player games)

Page 12: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

“Minmax Training is Hard ...”

Example: WGAN [Arjovky et al. 2017] with linear discriminator and generator

Bilinear saddle point = Linear in 𝜃 and 𝜙 ⇒ “Cycling behavior” (see right).

Dynamics:

Gauthier Gidel, Predoc III , November 28, 2018

Page 13: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Multi-player Games

Page 14: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Two-player Games

Zero-sum game if: also called Saddle Point (SP).

Example: WGAN formulation [Arjovsky et al. 2017]

Player 2Player 1

Gauthier Gidel, Predoc III , November 28, 2018

Page 15: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Two-player Games

Non zero-sum game if we do not have:

Player 2Player 1

Example: Non-saturating GAN: [Goodfellow et al. 2014]

Loss of Generator Loss of Discriminator

Gauthier Gidel, Predoc III , November 28, 2018

Page 16: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Two-player GamesPlayer 2Player 1

● In games we want to converge to the Saddle Point.

● Different from single objective minimization where

we want to avoid saddle points.

● Saddle point -> Zero-sum game (or Minmax)

Gauthier Gidel, Predoc III , November 28, 2018

Page 17: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Variational Inequality Problem (VIP)

Page 18: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Variational Inequality Problem

Nash-Equilibrium:

Stationary Conditions:

No player can improve its cost

- Based on stationary conditions.- Relates to vast literature with standard algorithms.

can be constraint sets.

Gauthier Gidel, Predoc III , November 28, 2018

Page 19: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Variational Inequality Problems

Nash-Equilibrium: Stationary Conditions:

Same problem but different perspective.

Joint Minimization vs. Stationary point

Gauthier Gidel, Predoc III , November 28, 2018

(under convexity assumption)

Page 20: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Variational Inequality Problem

Stationary Conditions:

Can be written as:

𝜔* solves the Variational Inequality

Gauthier Gidel, Predoc III , November 28, 2018

Page 21: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Variational Inequality ProblemStationary Conditions:

Figure from [Dunn 1979]

Unconstrained (or optimum in the interior):

Gauthier Gidel, Predoc III , November 28, 2018

Page 22: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Variational Inequality ProblemStationary Conditions:

Figure from [Dunn 1979]

Unconstrained (or ⍵* in the interior): Constrained and ⍵* on the boundary:

Figure from [Dunn 1979]

Gauthier Gidel, Predoc III , November 28, 2018

Page 23: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Techniques to optimize VIP (Batch setting)

Page 24: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.

- Easy to implement. (out of the training loop)- Can be combined with any method.

Averaging schemes can be efficiently implemented in an online fashion:

Gauthier Gidel, Predoc III , November 28, 2018

Page 25: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.

- Easy to implement. (out of the training loop)- Can be combined with any method.

General Online averaging:

Example 1: Uniform averaging

Example 2: Exponential moving averaging (EMA)

Gauthier Gidel, Predoc III , November 28, 2018

Page 26: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 2: Extragradient

- Step 1:

- Step 2:

Intuition:

1. Game prespective: Look one step in the future and anticipate next move of adversary.

- Standard in the literature.- Does not require averaging.- Theoretically and empirically

faster.

Gauthier Gidel, Predoc III , November 28, 2018

Page 27: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Frank-Wolfe Algorithm for Saddle Point Problems

Based on an AISTATS paper [Gidel et al. 2017]. Joint work with Tony Jebara and Simon Lacoste-Julien

Page 28: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Saddle point problems

Gauthier Gidel, Predoc III , November 28, 2018

Smooth, differentiable cost function,→ Compact constraints sets.→ Looking for stationary (fixed) points→ Gradient descent method.

Page 29: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Saddle point problems

Gauthier Gidel, Predoc III , November 28, 2018

Smooth, differentiable cost function,→ Compact constraints sets.→ Looking for stationary (fixed) points→ Gradient descent method.

Need to project ?

Page 30: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Projection-free Method

Projection may be challenging.

Figure from [Dunn 1979]

Gauthier Gidel, Predoc III , November 28, 2018

(Extra-)Gradient method:- Require Projection- Each projection is a quadratic problem

- Might be too expensive if the constraints set is structured.

- May use instead projection-free methods.- Frank-Wolfe is projection-free.- It only requires to solve linear problem.

2

Page 31: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Projection-free Method

Gauthier Gidel, Predoc III , November 28, 2018

(Extra-)Gradient method:- Require Projection- Each projection is a quadratic problem

- Might be too expensive if the constraints set is structured.

- May use instead projection-free methods.

- Frank-Wolfe is projection-free.- It only requires to solve linear problem.

Example of problem with expensive projection:

Page 32: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Projection-free Method

Gauthier Gidel, Predoc III , November 28, 2018

Page 33: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Projection-free Method

Gauthier Gidel, Predoc III , November 28, 2018

Page 34: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Projection-free Method

Gauthier Gidel, Predoc III , November 28, 2018

Page 35: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Projection-free Method for Saddle Point

Gauthier Gidel, Predoc III , November 28, 2018

Page 36: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Theoretical Contributions

Gauthier Gidel, Predoc III , November 28, 2018

[Hammond 1984]

Page 37: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Negative Momentum for Improved Game Dynamics

Based on an AISTATS submission [Gidel et al. 2018b]. Joint work with Reyhane Askari Hemmat, Mohammad Pezeshki, Rémi Le Priol, Gabriel Huang, Simon

Lacoste-Julien and Ioannis Mitliagkas

Page 38: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Two-player Games

Nash Equilibrium Smooth, differentiable L→ Looking for local Nash equil.

→ Gradient method: → Simultaneous→ Alternating

Gauthier Gidel, Predoc III , November 28, 2018

Page 39: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Two-player Games

Gauthier Gidel, Predoc III , November 28, 2018

Simultaneous Updates:

Alternating Updates:

Page 40: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

First contribution: Bilinear game

Page 41: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

“Proof by picture”Gradient descent

→ Simultaneous→ Alternating

Momentum→ Positive→ Negative

Page 42: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Second contribution: Game dynamics

Gauthier Gidel, Predoc III , November 28, 2018

Page 43: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Game dynamics under gradient descent

Jacobian is non-symmetric, with complex eigenvalues → Rotations in decision space

Momentum can manipulate the eigenvalues of the Jacobian.

Can momentum help/hurt??

Gauthier Gidel, Predoc III , November 28, 2018

Page 44: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Spoiler

Positive momentum can be bad for adversarial games

Practice that was very common when GANs were first invented.

→ Recent work reduced the momentum parameter.→ Not an accident

Gauthier Gidel, Predoc III , November 28, 2018

Page 45: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Momentum on games

Fixed point operator requires a state augmentation:(because we need previous iterate)

Recall Polyak’s momentum (on top of simultaneous grad. desc.):

Gauthier Gidel, Predoc III , November 28, 2018

Page 46: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

A Variational Inequality Perspective on GANs

Based on an ICLR submission [Gidel et al. 2018a]. Joint work with Hugo Berard, Gaëtan Vignoud, Pascal Vincent, Simon Lacoste-Julien

Page 47: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Quick recap on Generative Adversarial Networks (GANs)

(and two-player games)

Page 48: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Generative Adversarial Networks (GANs)

Fake Data

True Data

GeneratorNoise

DiscriminatorFakeorReal

[Goodfelow et al. NIPS 2014]

Gauthier Gidel, Predoc III , November 28, 2018

Page 49: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Generative Adversarial Networks (GANs)

Discriminator Generator

If D is non-parametric:

[Goodfelow et al. NIPS 2014]

Non-saturating GAN: “much stronger gradient in early learning”Loss of Generator Loss of Discriminator

Gauthier Gidel, Predoc III , November 28, 2018

Page 50: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Two-player Games

Non zero-sum game if we do not have:

Player 2Player 1

Example: Non-saturating GAN: [Goodfellow et al. 2014]

Loss of Generator Loss of Discriminator

Gauthier Gidel, Predoc III , November 28, 2018

Page 51: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

GANs as a Variational InequalityTakeaways:

- GAN can be formulated as a Variational Inequality.

- Encompass most of GANs formulations.

- Standard algorithms from Variational Inequality can be used for GANs.

- Theoretical Guarantees (for convex and stochastic cost functions).

Gauthier Gidel, Predoc III , November 28, 2018

Page 52: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 1: Averaging - Converge even for “cycling behavior”.

- Easy to implement. (out of the training loop)- Can be combined with any method.

General Online averaging:

Example 1: Uniform averaging

Example 2: Exponential moving averaging (EMA)

Gauthier Gidel, Predoc III , November 28, 2018

Page 53: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 1: Averaging

Simple Minmax problem:

Gauthier Gidel, Predoc III , November 28, 2018

Page 54: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 1: Averaging

Simple Minmax problem:

Gauthier Gidel, Predoc III , November 28, 2018

Page 55: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 1: Averaging

Gauthier Gidel, Predoc III , November 28, 2018

Page 56: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 2: Extragradient

- Step 1:

- Step 2:

Intuition:

1. Game prespective: Look one step in the future and anticipate next move of adversary.

2. Euler’s method: Extrapolation is close to an implicit method because

- Standard in the literature.- Does not require averaging.- Theoretically and empirically

faster.

Gauthier Gidel, Predoc III , November 28, 2018

Page 57: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 2: Extragradient

Intuition: Extrapolation is close to an implicit method because

Unknown:Require to solve a non-linear system

Gauthier Gidel, Predoc III , November 28, 2018

Page 58: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Standard Algorithms from Variational InequalityMethod 2: Extragradient Intuition: Extrapolation is close to an implicit method

*

*

almost the same

Gauthier Gidel, Predoc III , November 28, 2018

Page 59: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Problem: Extragradient requires to compute two gradients at each step.

Solution: Extrapolation from the past Re-use gradient.

- Step 1: Re-use from previous iteration.

- Step 2: (same as extragradient).

Extrapolation from the past: Re-using the gradients

Gauthier Gidel, Predoc III , November 28, 2018

Page 60: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Extrapolation from the past: Re-using the gradients

Problem: Extragradient requires to compute two gradients at each step.

Solution: Extrapolation from the past Re-use gradient.

- Step 1: Re-use from previous iteration.

- Step 2: (same as extragradient).

New Method !!!Related to [Daskalakis et al., 2018]

Gauthier Gidel, Predoc III , November 28, 2018

Page 61: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

step-size = 0.2 step-size = 0.5

Page 62: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Experimental Results

Page 63: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Experimental Results

Bilinear Stochastic Objective:

Gauthier Gidel, Predoc III , November 28, 2018

Page 64: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Experimental Results: WGAN (DCGAN) on CIFAR10

Inception Score on CIFAR10

Extragradient Methods

Inception Score vs nb of generator updates

Averaging

Gauthier Gidel, Predoc III , November 28, 2018

Page 65: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Extrapolation(Adam style)

Update(Adam style)

Page 66: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Experimental Results: WGAN-GP (ResNet) on CIFAR10

Extragradient Methods Averaging

Inception Score vs Number of

Gauthier Gidel, Predoc III , November 28, 2018

Page 67: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Conclusion

Page 68: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

- Training of adversarial formulations has been a recurrent issue in modern ML.

- Impact of non-convexity and stochasticity are less understood than in the single objective minimization.

- A better understanding of this framework is key to design new optimization algorithms.

- We provided tools to better understand saddle point problem, multi-player games and more generally variational inequalities.

- However, we just scratched the surface .

Gauthier Gidel, Mila Tea Talk, October 26, 2018

Page 69: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Thank you !

Gaëtan VignoudHugo Berard

Pascal VincentSimon

Lacoste-Julien Tony Jebara

Reyhane Askari Hemmat Gabriel Huang Rémi Le priol

Ioannis Mitliagkas

Mohammad Pezeshki

Page 70: Efficient Saddle Point Optimization for Modern Machine ... · Efficient Saddle Point Optimization for Modern Machine Learning Prédoc III - Gauthier Gidel Jury: Président : Yoshua

Bibliography- Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein gan." in ICML 2017.

- Dunn, Joseph C. "Rates of convergence for conditional gradient algorithms near singular and nonsingular extremals." SIAM Journal on Control and Optimization” 1979

- Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative adversarial nets." In Advances in neural information processing systems 2014.

- Gidel, Gauthier, Tony Jebara, and Simon Lacoste-Julien. "Frank-wolfe algorithms for saddle point problems." in AISTATS 2017.

- Gidel, Gauthier, Hugo Berard, Gaëtan Vignoud, Pascal Vincent, and Simon Lacoste-Julien. "A Variational Inequality Perspective on Generative Adversarial Nets." arXiv preprint arXiv:1802.10551 (2018).

- Gidel, Gauthier, Reyhane Askari Hemmat, Mohammad Pezeshki, Rémi Le priol, Gabriel Huang, Simon Lacoste-Julien, and Ioannis Mitliagkas. "Negative momentum for improved game dynamics." arXiv preprint arXiv:1807.04740 (2018).

- Hammond, Janice H. "Solving asymmetric variational inequality problems and systems of equations with generalized nonlinear programming algorithms." PhD diss., Massachusetts Institute of Technology, 1984.

Gauthier Gidel, Mila Tea Talk, October 26, 2018