Generative Adversarial Networks (GANs)
APMA DRP, Fall 2019
Emily Reed
Table of Contents
1. Introduction to Neural Networks
   a. Universal approximation
2. Generative Adversarial Networks
   a. Basic setup and methodology
   b. Results
3. Wasserstein metric and Wasserstein GANs
Neural Networks
What are neural nets?
Universal Approximation
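Given a continuous function $f$ on $[0,1]^n$ and $\varepsilon > 0$, there exists a function,
$$F(x;\theta) = \sum_{j=1}^{N} \alpha_j\, \sigma(w_j^\top x + b_j),$$
such that,
$$|F(x;\theta) - f(x)| < \varepsilon \quad \text{for all } x \in [0,1]^n,$$
where $F(x;\theta)$ is the output of a one-layer network with a continuous sigmoidal activation $\sigma$ and parameters $\theta = \{\alpha_j, w_j, b_j\}_{j=1}^{N}$.
[ 1 ] G. Cybenko, Approximation by Superpositions of a Sigmoidal Function, Mathematics of Control, Signals and Systems, 2 (1989), pp. 303-314.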
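As a rough numerical illustration of the statement above, the sketch below fits a one-hidden-layer sigmoid network to a continuous function on [0, 1]. It is not the construction used in Cybenko's proof: the hidden weights are drawn at random and only the outer coefficients are fit by least squares, and the names `fit_one_layer`, `n_hidden`, and the target function are assumptions made for this example.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_one_layer(f, n_hidden=100, n_points=200, seed=0):
    """Fit F(x) = sum_j alpha_j * sigmoid(w_j * x + b_j) to f on [0, 1].

    Assumption for this sketch: w_j and b_j are random and fixed;
    only the outer coefficients alpha_j are fit (by least squares).
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    w = rng.normal(scale=10.0, size=n_hidden)       # hidden weights
    b = rng.uniform(-10.0, 10.0, size=n_hidden)     # hidden biases
    hidden = sigmoid(np.outer(x, w) + b)            # shape (n_points, n_hidden)
    alpha, *_ = np.linalg.lstsq(hidden, f(x), rcond=1e-10)

    def network(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return sigmoid(np.outer(t, w) + b) @ alpha

    return network

def target(x):
    return np.sin(2.0 * np.pi * x)

F = fit_one_layer(target)
grid = np.linspace(0.0, 1.0, 200)
max_err = float(np.max(np.abs(F(grid) - target(grid))))
print("max error on the grid:", max_err)
```

The error is measured on the same grid used for fitting, so it shows how well a finite sum of sigmoids can match the target there, not how the network behaves between grid points.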
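A multilayer feedforward network has the general form,
$$F(x;\theta) = W_L\,\sigma\big(W_{L-1}\,\sigma(\cdots\sigma(W_1 x + b_1)\cdots) + b_{L-1}\big) + b_L,$$
with weight matrices $W_\ell$, bias vectors $b_\ell$, and an activation $\sigma$ applied componentwise.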
GANs
Recent Applications
● Deepfakes
● Image-to-image translation
● Realistic image generation
● Artistic image generation
  ○ ganbreeder.app
https://www.cleanpng.com/png-artificial-neural-network-deep-learning-convolutio-2489986/download-png.html
Basic Setup of GANs
[Diagram: samples from the prior distribution are fed to the Generator; the Generator's output and the real data are fed to the Discriminator, which decides: real or fake?]
Mathematical Structure of GANs (1 / 3)
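We define a function,
$$V(\theta_D, \theta_G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x;\theta_D)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z;\theta_G);\theta_D)\big)\big],$$
where D is our discriminator, G is our generator, and $\theta_D$ and $\theta_G$ are their corresponding parameters.
[ 2 ] I. Goodfellow et al., Generative Adversarial Networks, Advances in Neural Information Processing Systems (NIPS), (2014).
In training our adversarial network, we will compute,
$$\min_{\theta_G} \max_{\theta_D} V(\theta_D, \theta_G).$$
We can view training our GAN to convergence as an attempt to find the equilibrium of a two-player minimax game.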
Mathematical Structure of GANs (2 / 3)
Generator
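Takes in a random vector and produces a fake sample. The goal is to minimize the following:
$$\mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z;\theta_G);\theta_D)\big)\big]$$
● $z$: random vector
● $G(z;\theta_G)$: fake sample
● $D(G(z;\theta_G);\theta_D)$: likelihood that fake sample is real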
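Mathematical Structure of GANs (3 / 3)
Discriminator
Outputs a likelihood that the given images are from the real dataset. The goal is to maximize the following:
$$\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x;\theta_D)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z;\theta_G);\theta_D)\big)\big]$$
● $x$: real data sample
● $D(x;\theta_D)$: likelihood that real sample is real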
Goal
To study the mathematics behind and implement a generative adversarial network to produce realistic handwritten digits.
Samples from the MNIST dataset
https://medium.com/syncedreview/mnist-reborn-restored-and-expanded-additional-50k-training-samples-70c6f8a9e9a9
Approximation of Objective Function V ( 1 / 2 )
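We approximate the expectations in V using Monte Carlo integration (i.e. an average over samples from the distribution),
$$\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x;\theta_D)\big] \approx \frac{1}{m}\sum_{i=1}^{m} \log D(x^{(i)};\theta_D),$$
where $x^{(1)}, \dots, x^{(m)}$ are drawn from $p_{\text{data}}$.
The same follows for the second expectation of the objective function,
$$\mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z;\theta_G);\theta_D)\big)\big] \approx \frac{1}{m}\sum_{i=1}^{m} \log\big(1 - D(G(z^{(i)};\theta_G);\theta_D)\big),$$
where $z^{(1)}, \dots, z^{(m)}$ are drawn from the prior distribution $p_z$.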
Approximation of Objective Function V ( 2 / 2 )
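Therefore, we define the approximation,
$$\hat{V}(\theta_D, \theta_G) = \frac{1}{m}\sum_{i=1}^{m}\Big[\log D(x^{(i)};\theta_D) + \log\big(1 - D(G(z^{(i)};\theta_G);\theta_D)\big)\Big].$$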
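The Monte Carlo approximation of V can be sketched in a few lines of NumPy. The one-dimensional models here are assumptions chosen for illustration only (a logistic discriminator with parameters `(w, c)` and an affine generator with parameters `(a, b)`); they are not the architectures used in the talk.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def D(x, theta_d):
    """Toy discriminator (assumed): probability that x is real."""
    w, c = theta_d
    return sigmoid(w * x + c)

def G(z, theta_g):
    """Toy generator (assumed): maps noise z to a fake sample."""
    a, b = theta_g
    return a * z + b

def v_hat(x_batch, z_batch, theta_d, theta_g):
    """Average over samples in place of the two expectations in V."""
    real_term = np.mean(np.log(D(x_batch, theta_d)))
    fake_term = np.mean(np.log(1.0 - D(G(z_batch, theta_g), theta_d)))
    return real_term + fake_term

rng = np.random.default_rng(0)
m = 1000
x_batch = rng.normal(loc=4.0, scale=1.0, size=m)   # stand-in for real data
z_batch = rng.normal(size=m)                       # samples from the prior

# With w = c = 0 the discriminator outputs 1/2 for every input.
value = v_hat(x_batch, z_batch, theta_d=(0.0, 0.0), theta_g=(1.0, 0.0))
print(value, -np.log(4.0))
```

A discriminator that outputs 1/2 everywhere makes both averages equal to log(1/2), so the estimate is -log 4 regardless of the samples, which gives a simple sanity check for the implementation.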
Optimization ( 1 / 2 )
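Sub-problem 1
$$\max_{\theta_D} \hat{V}(\theta_D, \theta_G)$$
Apply gradient ascent,
$$\theta_D \leftarrow \theta_D + \eta\, \nabla_{\theta_D} \hat{V}(\theta_D, \theta_G),$$
where $\hat{V}$ is the Monte Carlo approximation of V and $\eta > 0$ is the step size.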
Optimization ( 2 / 2 )
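Sub-problem 2
$$\min_{\theta_G} \frac{1}{m}\sum_{i=1}^{m} \log\big(1 - D(G(z^{(i)};\theta_G);\theta_D)\big)$$
Apply gradient descent,
$$\theta_G \leftarrow \theta_G - \eta\, \nabla_{\theta_G} \frac{1}{m}\sum_{i=1}^{m} \log\big(1 - D(G(z^{(i)};\theta_G);\theta_D)\big),$$
where $\eta > 0$ is the step size.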
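To make the two update rules concrete, the sketch below takes one gradient ascent step on the discriminator parameters and one gradient descent step on the generator parameters. Everything model-specific is an assumption for illustration: a one-dimensional logistic discriminator D(x) = sigmoid(w x + c), an affine generator G(z) = a z + b, hand-derived gradients for these two models, and a step size `eta` picked arbitrarily.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def v_hat(x, z, theta_d, theta_g):
    """Monte Carlo estimate of V for the toy models (assumed)."""
    w, c = theta_d
    a, b = theta_g
    real = np.log(sigmoid(w * x + c))
    fake = np.log(1.0 - sigmoid(w * (a * z + b) + c))
    return np.mean(real) + np.mean(fake)

def grad_theta_d(x, z, theta_d, theta_g):
    """Gradient of v_hat with respect to (w, c)."""
    w, c = theta_d
    a, b = theta_g
    g = a * z + b
    s_real = sigmoid(w * x + c)
    s_fake = sigmoid(w * g + c)
    dw = np.mean((1.0 - s_real) * x) - np.mean(s_fake * g)
    dc = np.mean(1.0 - s_real) - np.mean(s_fake)
    return np.array([dw, dc])

def grad_theta_g(z, theta_d, theta_g):
    """Gradient of the generator term of v_hat with respect to (a, b)."""
    w, c = theta_d
    a, b = theta_g
    s_fake = sigmoid(w * (a * z + b) + c)
    da = np.mean(-s_fake * w * z)
    db = np.mean(-s_fake * w)
    return np.array([da, db])

rng = np.random.default_rng(0)
x = rng.normal(4.0, 1.0, size=256)    # stand-in for a real-data minibatch
z = rng.normal(size=256)              # minibatch from the noise prior
theta_d = np.array([0.0, 0.0])
theta_g = np.array([1.0, 0.0])
eta = 0.1                             # step size (arbitrary choice)

v_before = v_hat(x, z, theta_d, theta_g)
theta_d = theta_d + eta * grad_theta_d(x, z, theta_d, theta_g)   # sub-problem 1
v_after_d = v_hat(x, z, theta_d, theta_g)
theta_g = theta_g - eta * grad_theta_g(z, theta_d, theta_g)      # sub-problem 2
v_after_g = v_hat(x, z, theta_d, theta_g)
print(v_before, v_after_d, v_after_g)
```

The ascent step should raise the estimate and the descent step should lower it again, which reflects the opposing goals of the two players.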
Algorithm Pseudocode
for number of training iterations:
● Sample a minibatch $z^{(1)}, \dots, z^{(m)}$ from the noise prior distribution $p_z$, and sample a minibatch $x^{(1)}, \dots, x^{(m)}$ from the data generating distribution $p_{\text{data}}$.
● Apply SGD to sub-problem 1.
● Sample a minibatch $z^{(1)}, \dots, z^{(m)}$ from the noise prior distribution $p_z$.
● Apply SGD to sub-problem 2.
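The loop above can be sketched end to end on a one-dimensional toy problem. This is a minimal sketch under stated assumptions, not the convolutional model from the talk: the real data are taken to be N(4, 1), the noise prior is N(0, 1), the discriminator is D(x) = sigmoid(w x + c), the generator is G(z) = a z + b, and the function name `train_toy_gan`, the step size, and the iteration count are all hypothetical choices.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def train_toy_gan(n_iters=3000, m=64, eta=0.05, seed=0):
    """Alternate one SGD step per sub-problem, as in the pseudocode."""
    rng = np.random.default_rng(seed)
    w, c = 0.0, 0.0    # discriminator parameters (assumed toy model)
    a, b = 1.0, 0.0    # generator parameters (assumed toy model)
    for _ in range(n_iters):
        # Sample noise and real-data minibatches.
        z = rng.normal(size=m)
        x = rng.normal(4.0, 1.0, size=m)
        g = a * z + b
        s_real = sigmoid(w * x + c)
        s_fake = sigmoid(w * g + c)
        # Sub-problem 1: gradient ascent on the discriminator.
        grad_w = np.mean((1.0 - s_real) * x) - np.mean(s_fake * g)
        grad_c = np.mean(1.0 - s_real) - np.mean(s_fake)
        w += eta * grad_w
        c += eta * grad_c
        # Sample a fresh noise minibatch.
        z = rng.normal(size=m)
        s_fake = sigmoid(w * (a * z + b) + c)
        # Sub-problem 2: gradient descent on the generator.
        grad_a = np.mean(-s_fake * w * z)
        grad_b = np.mean(-s_fake * w)
        a -= eta * grad_a
        b -= eta * grad_b
    return (w, c), (a, b)

theta_d, theta_g = train_toy_gan()
print("discriminator (w, c):", theta_d)
print("generator (a, b):", theta_g)
```

Because this toy discriminator only sees a linear function of its input, it can compare the location of real and generated samples but not their spread, so the generator's shift `b` is the parameter to watch; the scale `a` is not pinned down by this setup.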
Model Architecture ( 1 / 2 )
Discriminator
where
Model Architecture ( 2 / 2 )
Generator
where
Results
Convolutional GAN
Convolutional GAN with
Wasserstein Metric
Wasserstein Metric ( 1 / 2 )
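We define the Wasserstein distance between two probability distributions $\mu$ and $\nu$ as,
$$W(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \mathbb{E}_{(x, y) \sim \gamma}\big[\|x - y\|\big],$$
where $\Pi(\mu, \nu)$ is the set of joint distributions whose marginals are $\mu$ and $\nu$. It measures the smallest amount of work necessary to transform the mass $\mu$ into the mass $\nu$.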
Wasserstein Metric ( 2 / 2 )
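We can also write the distance between two distributions $\mu$ and $\nu$ in the dual representation (Kantorovich-Rubinstein Duality),
$$W(\mu, \nu) = \sup_{\|f\|_L \le 1} \mathbb{E}_{x \sim \mu}\big[f(x)\big] - \mathbb{E}_{y \sim \nu}\big[f(y)\big],$$
which measures the largest difference in expectations of a “nice” (1-Lipschitz) function under the two distributions.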
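Both representations can be checked numerically in one dimension. The sketch below assumes two empirical distributions with the same number of equally weighted samples, in which case the optimal transport plan pairs the sorted samples; the helper names `wasserstein_1d` and `dual_value` are made up for this example.

```python
import numpy as np

def wasserstein_1d(x, y):
    """Wasserstein distance between two equal-size 1-D samples.

    Assumption: equal sample counts and uniform weights, so the optimal
    plan matches the i-th smallest x with the i-th smallest y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized samples")
    return float(np.mean(np.abs(x - y)))

def dual_value(x, y, f):
    """Difference in expectations of f under the two samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean(f(x)) - np.mean(f(y)))

x = [0.0, 1.0, 2.0]
y = [3.0, 1.0, 2.0]   # the same points shifted by one unit, in another order

primal = wasserstein_1d(x, y)
dual_shift = dual_value(x, y, lambda t: -t)   # f(t) = -t is 1-Lipschitz
dual_abs = dual_value(x, y, np.abs)           # f(t) = |t| is 1-Lipschitz
print(primal, dual_shift, dual_abs)
```

Every 1-Lipschitz function gives a lower bound on the distance; for a pure shift the bound from f(t) = -t is tight, while f(t) = |t| gives a smaller value here.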
[ 3 ] Kantorovich, L. V., & Rubinstein, G. S. (1958). On a space of completely additive functions. Vestnik Leningrad. Univ, 13(7), 52-59.
Thank you for listening!
Special thanks to Justin Dong.