f-GANs in an Information Geometric Nutshell


Transcript of f-GANs in an Information Geometric Nutshell

Page 1: f-GANs in an Information Geometric Nutshell (users.cecs.anu.edu.au/~rnock/docs/nips17-ncmqw-poster.pdf)

f-GANs in an Information Geometric Nutshell

Richard Nock, Zac Cranko, Aditya Krishna Menon, Lizhen Qu, Robert C. Williamson

Tagline: (i) complete the information-theoretic layer of f-GANs (Nowozin et al.'16), (ii) provide an equivalent information-geometric layer showing the fitting power of f-GANs, (iii) show the concinnity of deep architectures to the information-geometric layer, (iv) use it to devise improvements to the generator / discriminator in the GAN game.

Longer arXiv version (#1707.04385): more extensive treatment of the vig-f-GAN identity, analysis of the penalty $J(Q_\theta)$, the vig-f-GAN identity in expected utility theory, relationships with feature matching, etc.

Information theory: f-divergence

$$I_f(P\|Q) := E_{X\sim Q}\left[f\left(\frac{P(X)}{Q(X)}\right)\right],$$

with $f : \mathbb{R}_+ \to \mathbb{R}$ convex and $f(1) = 0$.

Information geometry: Bregman divergence

$$D_\varphi(\theta\|\rho) := \varphi(\theta) - \varphi(\rho) - (\theta - \rho)^\top \nabla\varphi(\rho),$$

with $\varphi$ convex and differentiable.
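To make the two divergences concrete, here is a minimal numeric sketch (an illustration, not code from the paper): with $f(t) = t\log t$ the f-divergence recovers $\mathrm{KL}(P\|Q)$, and with $\varphi(\theta) = \|\theta\|^2$ the Bregman divergence recovers the squared Euclidean distance.

```python
import numpy as np

# f-divergence I_f(P||Q) = E_{X~Q}[f(P(X)/Q(X))] for discrete p, q.
def f_divergence(p, q, f):
    return np.sum(q * f(p / q))

# Bregman divergence D_phi(theta||rho) = phi(theta) - phi(rho) - <theta - rho, grad phi(rho)>.
def bregman(theta, rho, phi, grad_phi):
    return phi(theta) - phi(rho) - (theta - rho) @ grad_phi(rho)

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.3, 0.4, 0.3])

f_kl = lambda t: t * np.log(t)           # convex with f(1) = 0
print(f_divergence(p, q, f_kl))          # I_f = KL(P||Q)
print(np.sum(p * np.log(p / q)))         # direct KL: same value

phi = lambda th: th @ th                 # convex, differentiable
grad_phi = lambda th: 2.0 * th
print(bregman(p, q, phi, grad_phi))      # equals ||p - q||^2
```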

Definitions, for a non-decreasing signature $\chi : \mathbb{R}_+ \to \mathbb{R}_+$:

$\chi$-logarithm: $\log_\chi(z) := \int_1^z \frac{1}{\chi(t)}\,\mathrm{d}t$.

$\chi$-exponential: $\exp_\chi(z) := 1 + \int_0^z \lambda(t)\,\mathrm{d}t$, with $\lambda(\log_\chi(z)) := \chi(z)$.

$\chi$-exponential family density, with cumulant $C : \Theta \to \mathbb{R}$, sufficient statistics $\phi : \mathcal{X} \to \mathbb{R}^d$ and coordinate $\theta$:
$$P_{\chi,C}(x|\theta,\phi) := \exp_\chi\left(\phi(x)^\top\theta - C(\theta)\right).$$

$\chi$-escort density, with normalisation $Z := \int_{\mathcal{X}} \chi(P_{\chi,C}(x|\theta,\phi))\,\mathrm{d}\mu(x)$:
$$\tilde{P}_{\chi,C} := \frac{1}{Z}\cdot\chi(P_{\chi,C}).$$

Example: $\chi = \mathrm{Id}$ gives $\log_\chi = \log$ and $\exp_\chi = \exp$, hence the exponential family $P_C(x|\theta,\phi) = \exp(\phi(x)^\top\theta - C(\theta))$, for which $I_{\mathrm{kl}}(P_\rho\|Q_\theta) = D_C(\theta\|\rho)$ and the escort is the density itself ($\tilde{P}_C = P_C$).
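A small numerical sketch of these objects (an illustration under simple assumptions, not the paper's code): $\log_\chi$ is computed by quadrature, $\exp_\chi$ as its functional inverse (the two definitions above are inverse to each other), and the escort of a discrete density is the normalised image under $\chi$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

# chi-logarithm by quadrature: log_chi(z) = int_1^z dt / chi(t).
def log_chi(z, chi):
    return quad(lambda t: 1.0 / chi(t), 1.0, z)[0]

# chi-exponential as the inverse of log_chi (equivalent to the integral
# definition, since d/dz log_chi = 1/chi and lambda(log_chi(z)) = chi(z)).
def exp_chi(y, chi, lo=1e-6, hi=1e6):
    return brentq(lambda z: log_chi(z, chi) - y, lo, hi)

chi_id = lambda t: t                       # chi = Id recovers log / exp
print(log_chi(2.0, chi_id), np.log(2.0))   # ~0.6931 twice
print(exp_chi(1.0, chi_id), np.e)          # ~2.7183 twice

# chi-escort of a discrete density p: normalise chi(p).
def escort(p, chi):
    w = chi(p)
    return w / w.sum()

p = np.array([0.2, 0.5, 0.3])
print(escort(p, chi_id))                   # chi = Id: the escort is p itself
print(escort(p, lambda t: t ** 2))         # a Tsallis-style escort (q = 2)
```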

Panel 1: deep architectures in the vig-f-GAN

Standard deep generator architecture $g : \mathcal{X} \to \mathbb{R}^d$, with $L$ (inner) deep layers:
$$\mathbb{R}^{d_l} \ni \phi_l(x) := v(w_l\phi_{l-1}(x) + b_l),\ \forall l \in \{1, 2, ..., L\},\qquad \phi_0(x) := x \in \mathcal{X},$$
$$g(x) := v_{\mathrm{out}}(\Gamma\phi_L(x) + \beta).$$
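A minimal numpy sketch of this generator architecture (layer sizes, activation and initialisation are illustrative assumptions, not the paper's experimental settings):

```python
import numpy as np

rng = np.random.default_rng(0)

def softplus(z):
    """A smooth inner activation v (numerically stable Softplus)."""
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0.0)

def generator(x, weights, biases, gamma, beta, v=softplus, v_out=np.tanh):
    """phi_0 = x; phi_l = v(w_l @ phi_{l-1} + b_l); g(x) = v_out(Gamma @ phi_L + beta)."""
    phi = x
    for w_l, b_l in zip(weights, biases):
        phi = v(w_l @ phi + b_l)
    return v_out(gamma @ phi + beta)

dims = [8, 16, 16, 8]                                        # d_0, d_1, ..., d_L
weights = [0.1 * rng.standard_normal((dims[l + 1], dims[l])) for l in range(3)]
biases = [np.zeros(dims[l + 1]) for l in range(3)]
gamma = 0.1 * rng.standard_normal((4, dims[-1]))             # output map Gamma
beta = np.zeros(4)                                           # output offset beta

z = generator(rng.standard_normal(8), weights, biases, gamma, beta)
print(z.shape)                                               # (4,): one sample
```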

Theorem (A): Suppose $v$ invertible, and let $x := g^{-1}(z)$. Then for any continuous signature $\chi_{\mathrm{net}}$, there exist an activation $v_{\mathrm{out}}$, weights $\Gamma, w_l$ and offsets $b_l \in \mathbb{R}^d$ ($\forall l \in \{1, 2, ..., L\}$) such that for any output $z$, the generator's density $Q_g$ satisfies
$$Q_g(z) = f(\tilde{Q}_{\mathrm{deep}}(x)),\qquad \tilde{Q}_{\mathrm{deep}}(x) := \prod_{l=1}^{L}\prod_{i=1}^{d} \tilde{P}_{\chi_{\mathrm{net}},b_{l,i}}(x\,|\,w_{l,i},\phi_{l-1}),$$
a product of escorts of $\chi_{\mathrm{net}}$-families with "deep" sufficient statistics $\phi_{l-1}$, coordinates $w_{l,i}$ and cumulants $b_{l,i}$.

Hence, the deep generator architecture is able to fit complex escorts for particular choices of inner activation $v$. But does this hold for popular $v$s? Define $v$ to be strongly admissible iff $\mathrm{dom}(v) \cap \mathbb{R}_+ \neq \emptyset$ and $v$ is $C^1$, lowerbounded, strictly increasing and convex. It is weakly admissible iff $\forall \epsilon > 0$ there exists a strongly admissible $v_\epsilon$ such that $\|v - v_\epsilon\|_{L^1} < \epsilon$ (see paper for details).
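To illustrate weak admissibility numerically (an illustration, not the paper's code; the integration interval is an arbitrary choice), the sketch below estimates the $L^1$ gap between ReLU and the strongly admissible $\mu$-ReLU defined in the experiments panel further down; the gap shrinks as $\mu \to 1$:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def mu_relu(z, mu):
    """mu-ReLU(z) = (z + sqrt((1-mu)^2 + z^2) + mu - 1) / 2; mu = 1 gives ReLU."""
    return (z + np.sqrt((1.0 - mu) ** 2 + z ** 2) + mu - 1.0) / 2.0

# Riemann-sum estimate of || relu - mu_relu ||_{L1} on [-5, 5].
z = np.linspace(-5.0, 5.0, 200_001)
dz = z[1] - z[0]
for mu in (0.0, 0.5, 0.9, 0.99):
    gap = np.sum(np.abs(relu(z) - mu_relu(z, mu))) * dz
    print(f"mu = {mu}: L1 gap ~ {gap:.4f}")    # decreases towards 0 as mu -> 1
```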

Panel 2: complete proper loss layer for the (vig-)f-GAN game

Theorem: $I_f(P\|Q) \propto \mathcal{L}(Q)$, with
$$\mathcal{L}(Q) := \sup_{T : \mathcal{X} \to \mathbb{R}}\ \left\{E_{X\sim P}[-\ell(+1, T(X))] + E_{X\sim Q}[-\ell(-1, T(X))]\right\},$$
where $-1$ = fake, $+1$ = real, and the loss function (Reid & Williamson'11) is built from an invertible link function $\psi : (0,1) \to \mathbb{R}$:
$$\ell(-1, z) := f^\star\!\left(f'\!\left(\frac{\psi^{-1}(z)}{1 - \psi^{-1}(z)}\right)\right),\qquad \ell(+1, z) := -f'\!\left(\frac{\psi^{-1}(z)}{1 - \psi^{-1}(z)}\right).$$
For the vig-f-GAN game the loss reads
$$\ell(+1, z) := -z,\qquad \ell_x(-1, z) := -\log_{(\chi^\bullet)_{1/\tilde{Q}(x)}}(-z),\qquad \chi^\bullet(t) := 1/\chi^{-1}(1/t).$$
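A minimal sketch of the proper-loss construction above, instantiated for concreteness (these are illustrative assumptions, not the paper's choices) with the KL generator $f(t) = t\log t$, for which $f'(t) = 1 + \log t$ and $f^\star(u) = \exp(u-1)$, and the sigmoid link $\psi^{-1}(z) = \sigma(z)$, so that $\psi^{-1}(z)/(1-\psi^{-1}(z)) = e^z$:

```python
import numpy as np

sigma = lambda z: 1.0 / (1.0 + np.exp(-z))   # psi^{-1} for the sigmoid link
fprime = lambda t: 1.0 + np.log(t)           # f'(t) for f(t) = t log t
fstar = lambda u: np.exp(u - 1.0)            # convex conjugate f*(u)

def loss(y, z):
    """ell(y, z) for labels y in {+1 (real), -1 (fake)}."""
    ratio = sigma(z) / (1.0 - sigma(z))      # equals exp(z) for this link
    if y == +1:
        return -fprime(ratio)
    return fstar(fprime(ratio))

z = 0.7
print(loss(+1, z), -(1.0 + z))               # closed form for this link: -(1 + z)
print(loss(-1, z), np.exp(z))                # closed form for this link: exp(z)
```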


Panel 3: experiments with new generator/discriminator components

Exp. A: replacing the sigmoid link by Matsushita's link in the discriminator:
$$\mathrm{mat}(z) := \frac{1}{2}\cdot\left(1 + \frac{z}{\sqrt{1 + z^2}}\right).$$

Exp. B: replacing the ReLU activation in the generator by its strongly admissible generalization, the $\mu$-ReLU:
$$\mu\text{-ReLU}(z) := \frac{z + \sqrt{(1-\mu)^2 + z^2} + \mu - 1}{2}.$$

Datasets: MNIST; LSUN "tower" ($\mu = 0.4$).

Code: https://github.com/qulizhen/fgan_info_geometric
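Drop-in numpy sketches of these two components (for illustration only; the repository linked above contains the actual experimental code):

```python
import numpy as np

def matsushita_link(z):
    """Matsushita link: smooth, strictly increasing, maps R onto (0, 1)."""
    return 0.5 * (1.0 + z / np.sqrt(1.0 + z * z))

def mu_relu(z, mu=0.4):
    """mu-ReLU: strongly admissible generalization of ReLU (mu = 0.4 on LSUN)."""
    return (z + np.sqrt((1.0 - mu) ** 2 + z * z) + mu - 1.0) / 2.0

z = np.linspace(-3.0, 3.0, 7)
print(matsushita_link(z))   # values in (0, 1), with matsushita_link(0) = 0.5
print(mu_relu(z))           # smooth, convex, lower-bounded, strictly increasing
```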

Layers in the GAN game (* = conditions apply), across nature, discriminator and generator:

Information theory (Nowozin et al.'16):
$$I_f(P\|Q) \stackrel{*}{=} \sup_\omega\left\{E_{X\sim P}[T_\omega(X)] - E_{X\sim Q_\theta}[f^\star(T_\omega(X))]\right\}.$$

Information geometry (vig-f-GAN):
$$\mathrm{KL}_{\chi,\tilde{Q}_\theta}(\tilde{Q}_\theta\|P_\rho) \stackrel{*}{=} \sup_\omega\left\{E_{X\sim P_\rho}[T_\omega(X)] - E_{X\sim \tilde{Q}_\theta}[(-\log_\chi \tilde{Q}_\theta)^\star(T_\omega(X))]\right\} \stackrel{*}{=} D_C(\theta\|\rho) + J(Q_\theta).$$

In short: Nowozin et al.'16 show $f\text{-GAN}(P, Q) = I_f(P\|Q)$; we show (the variational information-geometric f-GAN identity) $f\text{-GAN}(P, \mathrm{escort}(Q)) = D(\theta\|\vartheta) + \mathrm{Penalty}(Q)$.

Example (GAN):
$$f_{\mathrm{gan}}(z) := z\log z - (z+1)\log(z+1) + 2\log 2,\qquad \chi_{\mathrm{gan}}(z) := \frac{1}{\log\left(1 + \frac{1}{z}\right)}.$$
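A quick numeric sanity check of this GAN instantiation (an illustration, not from the paper): $f_{\mathrm{gan}}(1) = 0$ and $f_{\mathrm{gan}}$ is convex, as an f-divergence generator must be, while $\chi_{\mathrm{gan}}$ is positive and non-decreasing on $\mathbb{R}_+$, as a signature must be:

```python
import numpy as np

def f_gan(z):
    return z * np.log(z) - (z + 1.0) * np.log(z + 1.0) + 2.0 * np.log(2.0)

def chi_gan(z):
    return 1.0 / np.log(1.0 + 1.0 / z)

print(f_gan(1.0))                              # 0.0 up to rounding

z = np.linspace(0.01, 10.0, 1000)
print(np.all(np.diff(f_gan(z), 2) >= -1e-12))  # second differences >= 0: convex
c = chi_gan(z)
print(np.all(c > 0), np.all(np.diff(c) >= 0))  # positive and non-decreasing
```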

Theorems: (A) holds for any strongly admissible $v$. The following activations are (weakly or strongly) admissible: ELU, ReLU, leaky ReLU, Softplus (indeed $\mathrm{ReLU} = \lim_{\mu\to 1} \mu\text{-ReLU}$); see paper for more examples and details.
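A grid-based check of the admissibility conditions for two of these activations (an illustration; the interval and tolerances are arbitrary choices): Softplus passes the strong-admissibility tests, while ReLU fails strict monotonicity on the negative half-line and is only weakly admissible (as the limit of $\mu$-ReLUs):

```python
import numpy as np

z = np.linspace(-5.0, 5.0, 10_001)

def check(v):
    """(strictly increasing?, convex?, lower bound) estimated on a grid."""
    y = v(z)
    increasing = bool(np.all(np.diff(y) > 0))
    convex = bool(np.all(np.diff(y, 2) >= -1e-12))
    return increasing, convex, float(y.min())

softplus = lambda t: np.log1p(np.exp(-np.abs(t))) + np.maximum(t, 0.0)
relu = lambda t: np.maximum(t, 0.0)

print(check(softplus))   # (True, True, ~0.007): strongly admissible behaviour
print(check(relu))       # (False, True, 0.0): flat on z < 0, weakly admissible
```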