Restricted Boltzmann Machine (RBM) presentation of fundamental theory

M&S Restricted Boltzmann Machine - Theory - Seongwon Hwang

Transcript of Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Page 1: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

M&S

Restricted Boltzmann Machine - Theory -

Seongwon Hwang

Page 2: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Energy Based Model

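Page 3: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

1. Scalar Function

$V_x = V_0 \cos\theta$

$V_y = V_0 \sin\theta - g t$

$\vec{V} = V_0 \cos\theta \, \hat{i} + (V_0 \sin\theta - g t) \, \hat{j}$

$E = mgh + \frac{1}{2} m v^2$

*Total Energy = Potential + Kinetic Energy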

Page 4: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

2. Principle of Minimum Energy

Principle of Maximum Entropy: equilibrium at fixed internal energy

Principle of Minimum Energy: equilibrium at fixed entropy

[Figure: energy E versus entropy S, with the equilibrium and unstable states marked]

Page 5: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

In Neural Network

Supervised Model

$E(W_{ij}, x_i, y_j)$

$x_i$: input variables

$y_j$: output variables

Energy = -Correlation, with weights $W_{ij}$

Unsupervised Model

$E(W_{ij}, x_i)$

$x_i$: input variables

Energy with input variables = -Correlation, with weights $W_{ij}$

Page 6: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

In Neural Network

Unsupervised Model with Hidden units

$E(W_{ij}, v_i, h_j)$

$v_i$: visible variables

$h_j$: hidden variables

Energy = -Correlation, with weights $W_{ij}$

Energy and correlation

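Page 7: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

In Neural Network

Learning in unsupervised model

[Figure: energy $E(W, x)$ plotted against $x$ before and after the weight update $W \to W'$, with $x_{data}$ and $\min_x$ marked on the $x$ axis]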

Page 8: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

How do we define energy in a neural network?

Hopfield Neural Network

Page 9: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Two constraints

1. Symmetric weights between neurons: $W_{ij} = W_{ji}$

2. Asynchronous updating is required to reach a stable state: randomly activate a node

[Figure: three-node network with nodes $x_1$, $x_2$, $x_3$]

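Page 13: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Define energy by Hopfield

$E = -\sum_{i<j} x_i x_j w_{ij}, \quad x_i \in \{0, 1\}$

[Figure: five-node network $x_1, ..., x_5$ with weighted edges; weight labels as extracted: 2, 3, 11, 2, 4, 3]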
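The Hopfield energy on this slide can be evaluated directly for a given binary configuration. The following is a minimal sketch; the function name `hopfield_energy` and the weight values are hypothetical and are not the weights drawn on the slide.

```python
from itertools import combinations

def hopfield_energy(x, w):
    """E = -sum_{i<j} x_i * x_j * w_ij for binary states x_i in {0, 1}."""
    return -sum(x[i] * x[j] * w.get((i, j), 0.0)
                for i, j in combinations(range(len(x)), 2))

# Hypothetical symmetric weights, keyed by (i, j) with i < j (0-based indices).
# Pairs that are not listed are treated as unconnected (weight 0).
w = {(0, 1): 2.0, (0, 2): 1.0, (1, 2): 3.0, (2, 4): 1.0}

print(hopfield_energy([1, 1, 1, 0, 0], w))  # -(2 + 1 + 3) = -6.0
print(hopfield_energy([0, 1, 1, 0, 1], w))  # -(3 + 1) = -4.0
```

Only pairs in which both nodes are active contribute, so configurations that switch on strongly connected nodes together get lower energy.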

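Page 14: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Example for intuition

$x_i \in \{0, 1\}$

Configuration 1: $x_1 = 1, x_2 = 1, x_3 = 1, x_4 = 0, x_5 = 0$

Configuration 2: $x_1 = 0, x_2 = 1, x_3 = 1, x_4 = 0, x_5 = 1$

...

[Figure: the five-node network with the energy $E$ of each configuration; energy labels as extracted: 7, 6]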

Page 15: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Application - Data store

$x_i \in \{0, 1\}$

[Figure: the five-node network, first with only $x_1 = 1$ given, then with all node states filled in as $(x_1, ..., x_5) = (1, 1, 1, 0, 0)$]

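Page 17: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in Hopfield Network

$E = -\sum_{i<j} x_i x_j w_{ij}, \quad x_i \in \{0, 1\}$

Several datasets

Weight update of $w_{ij}$

[Figure: five-node network $x_1, ..., x_5$ with edge weights $w_{12}, w_{13}, w_{15}, w_{23}, w_{34}, w_{35}, w_{45}$]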

Page 18: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Boltzmann Machine

Page 19: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Overview

Hopfield Neural Network: energy and correlation

Boltzmann Machine: probability and correlation

Page 20: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Overview

Boltzmann Machine: probability and correlation

Boltzmann Distribution, where $E$ is the energy:

$P(v_i) = \frac{e^{-E(v_i)}}{\sum_j e^{-E(v_j)}}$

Page 21: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Thermal physics for the Boltzmann distribution

Page 22: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Macrostate Vs. Microstate

Three coins (10, 100, 500), each showing heads (H) or tails (T)

Microstates: HHH, HHT, HTH, THH, HTT, THT, TTH, TTT

Total number of microstates: 8

[Figure: Microstate 1, Microstate 2, Microstate 3, each specified by position, velocity, …]

Page 23: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Macrostate Vs. Microstate

Macrostates: number of heads H = 0, 1, 2, 3 (equivalently, number of tails T = 3, 2, 1, 0)

Total number of macrostates: 4

A macrostate is specified by temperature, pressure, …

Page 25: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Canonical Ensemble (NVT Ensemble)

Ensemble of microstates with N, V, T fixed

$(N, V, T, E_0)$, $(N, V, T, E_1)$

Page 27: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Boltzmann Distribution

$P(E_i) = \frac{e^{-E_i / k_B T}}{\sum_j e^{-E_j / k_B T}}$

$W = \frac{N!}{N_0! \, N_1! \, N_2! \cdots}$

$S \propto \ln W$: Maximum Entropy!

$W$: number of cases

$N$: number of particles in the total system

$N_i$: number of particles in the $i$th microstate

Number of cases for each occupation $(N_0, N_1, N_2, N_3, ...)$:

$(N, 0, 0, 0, ...)$

$(N-2, 2, 0, 0, ...)$

$(N-3, 2, 1, 0, ...)$

...

Page 28: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Boltzmann Distribution

$P(E_i) = \frac{e^{-E_i / k_B T}}{\sum_j e^{-E_j / k_B T}}$

[Figure: $P(E_i)$ plotted against $E_i$]
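To make the shape of $P(E_i)$ concrete, the distribution can be computed for a handful of energy levels. This is a minimal sketch; the function name `boltzmann_probabilities`, the energy levels, and the use of a single parameter `kT` standing for $k_B T$ are illustrative assumptions.

```python
import math

def boltzmann_probabilities(energies, kT=1.0):
    """P(E_i) = exp(-E_i / kT) / sum_j exp(-E_j / kT)."""
    e_min = min(energies)  # shift for numerical stability; it cancels in the ratio
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)
    return [wt / z for wt in weights]

levels = [0.0, 1.0, 2.0, 3.0]  # hypothetical energy levels
for kT in (0.5, 1.0, 5.0):
    probs = boltzmann_probabilities(levels, kT)
    print(kT, [round(p, 3) for p in probs])
```

Lower energies always receive higher probability, and the distribution flattens as `kT` grows.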

Page 29: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

$P(E_i) = \frac{e^{-E_i / k_B T}}{\sum_j e^{-E_j / k_B T}}$

[Figure: grids of per-particle energy values, starting with 1 unit for every particle and then redistributed, with a histogram over energy levels 0 to 7]

Page 30: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for connection between Physics and Network

As energy changes:

Physics: changes of molecular structure

Network: changes of configuration of network

[Figure: grid of per-particle energy values with a histogram over energy levels 0 to 7]

Page 32: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Helmholtz free energy

$F = -k_B T \ln Z = -k_B T \ln\left(\sum_j e^{-\beta E_j}\right)$

= Free energy associated with Canonical Ensemble
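As a small numerical check of this formula, the free energy can be computed from a list of energy levels. This is a sketch under the assumption $\beta = 1 / (k_B T)$; the function name `helmholtz_free_energy` and the parameter `kT` are hypothetical.

```python
import math

def helmholtz_free_energy(energies, kT=1.0):
    """F = -kT * ln(sum_j exp(-beta * E_j)) with beta = 1 / kT."""
    beta = 1.0 / kT
    e_min = min(energies)
    # ln Z = -beta * e_min + ln(sum_j exp(-beta * (E_j - e_min))), which avoids overflow
    log_z = -beta * e_min + math.log(sum(math.exp(-beta * (e - e_min)) for e in energies))
    return -kT * log_z

print(helmholtz_free_energy([2.0]))            # a single level: F equals that energy
print(helmholtz_free_energy([0.0, 0.0], 2.0))  # two degenerate levels: F = -kT * ln 2
```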

Page 33: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Overview

Probability, where $E$ is the energy:

$P(v_i) = \frac{e^{-E(v_i)}}{\sum_j e^{-E(v_j)}}$

Configurations: $v_1 = (0, 1, 0, 1, ...)$, $v_2 = (1, 1, 0, 1, ...)$, ...

$2^N$ configurations of N-dimensional binary data

[Figure: network of visible units $v_1, ..., v_7$]

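Page 34: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Overview

Probability, where $E$ is the energy:

$P(v_i, h_j) = \frac{e^{-E(v_i, h_j)}}{\sum_{k,l} e^{-E(v_k, h_l)}}$

[Figure: network of visible units $v_1, ..., v_4$ and hidden units $h_1, h_2, h_3$]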

Page 35: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Restricted Boltzmann Machine

Page 36: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Restriction: no connections among the hidden units (H) and none among the visible units (V)

[Figure: a Boltzmann Machine and a Restricted Boltzmann Machine, each with visible units $v_1, ..., v_4$ and hidden units $h_1, h_2, h_3$]

Page 37: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Restriction: no connections among the hidden units (H) and none among the visible units (V)

Restricted Boltzmann Machine

Conditionally independent!

$P(A, B \mid C) = P(A \mid C) \, P(B \mid C)$

$P(h_1, h_2 \mid v) = P(h_1 \mid v) \, P(h_2 \mid v)$

General Form

$P(h \mid v) = \prod_j P(h_j \mid v)$

$P(v \mid h) = \prod_i P(v_i \mid h)$

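Page 38: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Energy from Hopfield Network

$E(v, h) = -\sum_i \sum_j w_{ij} v_i h_j - \sum_i b_i v_i - \sum_j c_j h_j$

$b_i$: bias of $v$; $c_j$: bias of $h$

$P(v_i) = \frac{e^{-E(v_i)}}{\sum_j e^{-E(v_j)}}$

$P(v, h) = \frac{e^{-E(v, h)}}{\sum_{v, h} e^{-E(v, h)}}$

[Figure: network of visible units $v_1, ..., v_4$ and hidden units $h_1, h_2, h_3$]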
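For a very small RBM, the energy and the joint distribution on this slide can be computed by brute-force enumeration. This is a minimal sketch; the function names and the parameter values (2 visible and 2 hidden units) are hypothetical, and the enumeration is only feasible for tiny models.

```python
import math
from itertools import product

def energy(v, h, w, b, c):
    """E(v, h) = -sum_ij w_ij v_i h_j - sum_i b_i v_i - sum_j c_j h_j."""
    interaction = sum(w[i][j] * v[i] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    visible_bias = sum(bi * vi for bi, vi in zip(b, v))
    hidden_bias = sum(cj * hj for cj, hj in zip(c, h))
    return -interaction - visible_bias - hidden_bias

def joint_distribution(w, b, c):
    """Enumerate every (v, h) configuration and normalize exp(-E(v, h))."""
    states = [(v, h)
              for v in product((0, 1), repeat=len(b))
              for h in product((0, 1), repeat=len(c))]
    weights = [math.exp(-energy(v, h, w, b, c)) for v, h in states]
    z = sum(weights)  # partition function: sum over all v and h
    return {s: wt / z for s, wt in zip(states, weights)}

# Hypothetical parameters: w[i][j] connects visible unit i to hidden unit j.
w = [[1.0, -0.5], [0.0, 2.0]]
b = [0.1, -0.2]
c = [0.0, 0.3]

p = joint_distribution(w, b, c)
p_v = {}
for (v, h), prob in p.items():
    p_v[v] = p_v.get(v, 0.0) + prob  # marginal P(v) = sum_h P(v, h)
print(p_v)
```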

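Page 39: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Two Important Conditional Probabilities! – First

$P(h_j = 1 \mid v) = \sigma\left(\sum_i w_{ij} v_i + c_j\right)$

$\sigma(x) = \frac{e^x}{1 + e^x}$

[Figure: network of visible units $v_1, ..., v_4$ and hidden units $h_1, h_2, h_3$]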

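Page 40: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Two Important Conditional Probabilities! – Second

$P(v_i = 1 \mid h) = \sigma\left(\sum_j w_{ij} h_j + b_i\right)$

[Figure: network of visible units $v_1, ..., v_4$ and hidden units $h_1, h_2, h_3$]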
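The two conditional probabilities can be written as short functions. This is a minimal sketch; the function names and the example parameter values are hypothetical.

```python
import math

def sigmoid(x):
    """sigma(x) = e^x / (1 + e^x), evaluated in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def p_h_given_v(v, w, c):
    """P(h_j = 1 | v) = sigma(sum_i w_ij v_i + c_j) for every hidden unit j."""
    return [sigmoid(sum(w[i][j] * v[i] for i in range(len(v))) + c[j])
            for j in range(len(c))]

def p_v_given_h(h, w, b):
    """P(v_i = 1 | h) = sigma(sum_j w_ij h_j + b_i) for every visible unit i."""
    return [sigmoid(sum(w[i][j] * h[j] for j in range(len(h))) + b[i])
            for i in range(len(b))]

# Hypothetical parameters: 4 visible units, 3 hidden units.
w = [[0.5, -1.0, 0.0], [0.0, 0.3, 0.2], [1.5, 0.0, -0.4], [-0.7, 0.1, 0.9]]
b = [0.0, 0.1, -0.1, 0.0]
c = [0.2, 0.0, -0.3]

print(p_h_given_v([1, 0, 1, 1], w, c))
print(p_v_given_h([1, 0, 1], w, b))
```

Because each probability depends only on the other layer, all units of one layer can be evaluated independently given the other layer.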

Page 41: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Generative Vs. Discriminative Model

<Generative Model>: $P(x \mid y)$, $P(x, y)$
EX) Gaussians, Sigmoid Belief Networks, Bayesian Networks, RBM

<Discriminative Model>: $P(y \mid x)$
EX) Neural Network, Logistic Regression, Support Vector Machine

[Figure: graphs relating Y and X for each model type]

Page 42: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Maximum Likelihood Estimator

[Figure: a sample drawn from a population]

Estimate the unobserved parameters of the population by maximizing the likelihood of the observed samples

Page 43: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Maximum Likelihood Estimator

EX) What is the probability $p$ that a coin lands heads?

Observed: H H T

$L(\theta) = P(x \mid \theta) = p^2 (1 - p) = p^2 - p^3$

$\frac{\partial L(p)}{\partial p} = 2p - 3p^2 = 0$

$p = \frac{2}{3}$
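The same estimate can be found numerically by scanning candidate values of $p$ and keeping the one with the highest likelihood. This is a minimal sketch; the grid resolution and the names `likelihood` and `p_hat` are illustrative choices.

```python
def likelihood(p, heads=2, tails=1):
    """L(p) = p^heads * (1 - p)^tails for independent coin flips."""
    return p ** heads * (1 - p) ** tails

# Scan a grid of candidate probabilities and keep the maximizer.
grid = [k / 1000 for k in range(1001)]
p_hat = max(grid, key=likelihood)

print(p_hat)  # close to the analytic answer 2/3
```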

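Page 44: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

Cost = Negative Log-Likelihood (NLL)

$NLL(\theta \mid v) = -\ln P(v \mid \theta) = -\ln \sum_h e^{-E(v, h)} + \ln \sum_{v, h} e^{-E(v, h)}$

The first term is labeled Free Energy 1 and Positive Phase; the second term is labeled Free Energy ∞ and Negative Phase

Gradient Descent for NLL

$\frac{\partial NLL(\theta \mid v)}{\partial \theta} = \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{data} - \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{model}$

<...>: Expectation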

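Page 45: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

Gradient Descent for NLL

$\frac{\partial NLL(\theta \mid v)}{\partial \theta} = \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{data} - \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{model}$

Data term: easy to compute! The visible units are fixed to the data, e.g. $v = (1, 0, 1, 1)$, and only $h_1, h_2, h_3$ with $h_j \in \{0, 1\}$ vary.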

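Page 46: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

Gradient Descent for NLL

$\frac{\partial NLL(\theta \mid v)}{\partial \theta} = \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{data} - \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{model}$

Model term: hard to compute! Both $v_i \in \{0, 1\}$ and $h_j \in \{0, 1\}$ vary.

Total number of possible configurations: $2^{m+n}$, for $m + n$ binary units in total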

Page 47: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Markov Chain Monte Carlo (MCMC)

1. Markov Chain

In a first-order Markov chain, the next state depends only on the immediately preceding state. In a second- or higher-order chain, the next state depends on the two or more preceding states.

Page 48: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Markov Chain Monte Carlo (MCMC)

2. Monte Carlo – Compute a value statistically by using random numbers

EX) Compute the circular constant $\pi$

<Evaluation> VS. Sampling

Evaluation: $x^2 + y^2 \le 1$

Sampling: $\frac{\pi}{4} \approx \frac{\text{Number of samples in the circle}}{\text{Total number of samples}}$
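The circle example can be sketched in a few lines: draw uniform points in the unit square, count those inside the quarter circle, and scale the ratio by 4. The function name `estimate_pi` and the fixed seed are illustrative assumptions.

```python
import random

def estimate_pi(num_samples, seed=0):
    """Estimate pi from the fraction of uniform points with x^2 + y^2 <= 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()  # uniform point in the unit square
        if x * x + y * y <= 1.0:           # evaluation: is the point in the circle?
            inside += 1
    return 4.0 * inside / num_samples      # quarter-circle area is pi / 4

for n in (100, 10_000, 200_000):
    print(n, estimate_pi(n))
```

The estimate tightens slowly as the number of samples grows, which is typical of Monte Carlo methods.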

Page 49: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Gibbs Sampling

Multi-dimensional variables: $x_1, x_2, x_3, ...$

Joint probability $p(x_1, x_2, x_3, ...)$ or conditional probability $p(x_i \mid x_{-i})$

- Algorithm -

1. Set up initial values randomly

2. Sample with the conditional distribution

3. Repeat until the samples reach a stationary distribution

$r_0 = (1, 0, 0, 0, 1, 1, 0, ...)$, $r_1 = (0, 1, 0, 1, 1, 1, 0, ...)$, $r_2 = (1, 0, 1, 0, 1, 1, 1, ...)$

$p(r_1 \mid r_0)$, $p(r_2 \mid r_1)$, ...
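The three steps of the algorithm can be sketched on a toy problem with two binary variables. The joint table, the function names, and the chain length are hypothetical; the conditionals are derived from the table only so that the result can be checked against it.

```python
import random

# Hypothetical joint distribution p(x1, x2) over two binary variables.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def conditional_one(joint, x, i):
    """p(x_i = 1 | x_{-i}), computed from the joint table."""
    x0, x1 = list(x), list(x)
    x0[i], x1[i] = 0, 1
    p0, p1 = joint[tuple(x0)], joint[tuple(x1)]
    return p1 / (p0 + p1)

def gibbs(joint, num_steps, seed=0):
    rng = random.Random(seed)
    x = [rng.randint(0, 1), rng.randint(0, 1)]  # step 1: random initial values
    counts = {state: 0 for state in joint}
    for _ in range(num_steps):
        for i in range(len(x)):                 # step 2: resample each variable in turn
            x[i] = 1 if rng.random() < conditional_one(joint, x, i) else 0
        counts[tuple(x)] += 1                   # step 3: keep sampling and record states
    return {s: n / num_steps for s, n in counts.items()}

estimate = gibbs(joint, 50_000)
for state in sorted(joint):
    print(state, joint[state], round(estimate[state], 3))
```

After enough steps the visit frequencies approach the joint probabilities, even though only conditional distributions were sampled.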

Page 50: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

k-th Contrastive Divergence

- Characteristics -

1. Real data is used as the initial value

2. The k-th sample is used to estimate the expectation under the target distribution

3. k = 1 is often sufficient, since real data is used as the initial value

Example with $k = 2$: data $r_0 = (1, 0, 0, 0, 1, 1, 0, ...)$, $r_1 = (0, 1, 0, 1, 1, 1, 0, ...)$, $r_2 = (1, 0, 1, 0, 1, 1, 1, ...)$, sampled with $p(r_1 \mid r_0)$ and $p(r_2 \mid r_1)$

Example with $k = 1$: data $r_0 = (1, 0, 0, 0, 1, 1, 0, ...)$, $r_1 = (0, 1, 0, 1, 1, 1, 0, ...)$, sampled with $p(r_1 \mid r_0)$


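Page 53: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Approximation in RBM

$\left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{model}$

MCMC_Gibbs sampling: $\langle f(x) \rangle_{model} \approx \frac{1}{m} \sum_m f(x_m)$

CD_k=1: $\frac{1}{m} \sum_m f(x_m) \approx f(x^{(k)}) \approx f(x^{(1)})$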

Page 54: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Sampling Algorithm in RBM

1st Step

- Use real data as the initial value

Input data: $v = (1, 0, 1, 1)$, $v_i \in \{0, 1\}$

Hidden units: $h_1, h_2, h_3$, $h_j \in \{0, 1\}$

Page 55: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Sampling Algorithm in RBM

2nd Step

- Sample each hidden unit from its conditional probability, starting from the initial values

$P(h_j = 1 \mid v) = \sigma\left(\sum_i w_{ij} v_i + c_j\right)$, $\quad \sigma(x) = \frac{e^x}{1 + e^x}$

Input data: $v = (1, 0, 1, 1)$, $v_i \in \{0, 1\}$

Hidden units sampled one at a time: $h_1 = 1$, then $h_2 = 0$, then $h_3 = 1$, $h_j \in \{0, 1\}$

Page 60: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Sampling Algorithm in RBM

3rd Step

- Sample each input unit from its conditional probability, starting from the sampled hidden units

$P(v_i = 1 \mid h) = \sigma\left(\sum_j w_{ij} h_j + b_i\right)$

Sampled hidden units: $h = (1, 0, 1)$, $h_j \in \{0, 1\}$

Input units resampled one at a time: $v = (1, 0, 1, 1)$ becomes $v = (0, 0, 1, 0)$, $v_i \in \{0, 1\}$

Reconstruction! Generative Model!

CD_k=1
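Steps 1 to 3 together form one Gibbs step: start from a data vector, sample the hidden units, then sample a reconstruction of the visible units. The following is a minimal sketch; the function names, the random initial weights, and the seed are hypothetical, so the sampled values will generally differ from those on the slides.

```python
import math
import random

def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sample_hidden(v, w, c, rng):
    """Sample every h_j from P(h_j = 1 | v) = sigma(sum_i w_ij v_i + c_j)."""
    probs = [sigmoid(sum(w[i][j] * v[i] for i in range(len(v))) + c[j])
             for j in range(len(c))]
    return [1 if rng.random() < p else 0 for p in probs]

def sample_visible(h, w, b, rng):
    """Sample every v_i from P(v_i = 1 | h) = sigma(sum_j w_ij h_j + b_i)."""
    probs = [sigmoid(sum(w[i][j] * h[j] for j in range(len(h))) + b[i])
             for i in range(len(b))]
    return [1 if rng.random() < p else 0 for p in probs]

def gibbs_step(v, w, b, c, rng):
    h = sample_hidden(v, w, c, rng)         # 2nd step: hidden units given the input
    v_recon = sample_visible(h, w, b, rng)  # 3rd step: reconstruction given the hidden units
    return h, v_recon

rng = random.Random(0)
# Hypothetical small random parameters: 4 visible units, 3 hidden units.
w = [[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(4)]
b = [0.0] * 4
c = [0.0] * 3

v_data = [1, 0, 1, 1]                       # 1st step: real data as the initial value
h, v_recon = gibbs_step(v_data, w, b, c, rng)
print(h, v_recon)
```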

Page 66: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Sampling Algorithm in RBM

4th Step - Perform the sampling step k times (CD_k)

[Figure: alternating layers $v_1, v_2, ...$ and $h_1, h_2, ...$ at t = 0 (Data), t = 1, …, t = ∞ ≈ k (Model)]

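Page 68: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

Gradient Descent for NLL

$\frac{\partial NLL(\theta \mid v)}{\partial \theta} = \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{data} - \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{model}$

<...>: Expectation

$\theta = \{w, b, c\}$

$E(v, h) = -\sum_i \sum_j w_{ij} v_i h_j - \sum_i b_i v_i - \sum_j c_j h_j$

$\frac{\partial E(v, h)}{\partial w_{ij}} = -v_i h_j$

$\frac{\partial E(v, h)}{\partial b_i} = -v_i$

$\frac{\partial E(v, h)}{\partial c_j} = -h_j$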

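Page 69: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

Gradient Descent for NLL, with $\theta = \{w, b, c\}$

$\frac{\partial NLL(\theta \mid v)}{\partial \theta} = \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{data} - \left\langle \frac{\partial E(v, h)}{\partial \theta} \right\rangle_{model}$

$\frac{\partial E(v, h)}{\partial w_{ij}} = -v_i h_j$, $\quad \frac{\partial E(v, h)}{\partial b_i} = -v_i$, $\quad \frac{\partial E(v, h)}{\partial c_j} = -h_j$

$\Delta w_{ij} = \eta_w \left( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model} \right)$

$\Delta b_i = \eta_b \left( \langle v_i \rangle_{data} - \langle v_i \rangle_{model} \right)$

$\Delta c_j = \eta_c \left( \langle h_j \rangle_{data} - \langle h_j \rangle_{model} \right)$

Page 70: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

The expectations in the update rules are computed as:

$\langle v_i h_j \rangle = \sigma\left(\sum_i w_{ij} v_i + c_j\right) v_i$

$\langle h_j \rangle = \sigma\left(\sum_i w_{ij} v_i + c_j\right)$

$\langle v_i \rangle = v_i$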

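Page 71: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

Gradient Descent for NLL, with $\theta = \{w, b, c\}$

$w_{ij}^{t+1} = w_{ij}^{t} + \Delta w_{ij}$

$b_i^{t+1} = b_i^{t} + \Delta b_i$

$c_j^{t+1} = c_j^{t} + \Delta c_j$

Page 72: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Learning in RBM

In each update, the first term in the parentheses comes from the data $v$ and the second from the model sample $v^{(k)}$:

$w_{ij}^{t+1} = w_{ij}^{t} + \eta_w \left( \sigma\left(\sum_i w_{ij} v_i + c_j\right) v_i - \sigma\left(\sum_i w_{ij} v_i^{(k)} + c_j\right) v_i^{(k)} \right)$

$b_i^{t+1} = b_i^{t} + \eta_b \left( v_i - v_i^{(k)} \right)$

$c_j^{t+1} = c_j^{t} + \eta_c \left( \sigma\left(\sum_i w_{ij} v_i + c_j\right) - \sigma\left(\sum_i w_{ij} v_i^{(k)} + c_j\right) \right)$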
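The update rules can be combined with the sampling steps into a single CD-k training step. This is a minimal sketch for one training vector at a time; the function names, the toy dataset, the initial weights, and the single learning rate `eta` (used in place of separate $\eta_w$, $\eta_b$, $\eta_c$) are all assumptions for illustration.

```python
import math
import random

def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def hidden_probs(v, w, c):
    return [sigmoid(sum(w[i][j] * v[i] for i in range(len(v))) + c[j])
            for j in range(len(c))]

def visible_probs(h, w, b):
    return [sigmoid(sum(w[i][j] * h[j] for j in range(len(h))) + b[i])
            for i in range(len(b))]

def bernoulli(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]

def cd_update(v_data, w, b, c, rng, k=1, eta=0.1):
    """Apply one CD-k update to w, b, c in place and return the sample v^(k)."""
    v_k = list(v_data)                      # the chain starts from real data
    for _ in range(k):
        h = bernoulli(hidden_probs(v_k, w, c), rng)
        v_k = bernoulli(visible_probs(h, w, b), rng)
    ph_data = hidden_probs(v_data, w, c)    # sigma(sum_i w_ij v_i + c_j)
    ph_model = hidden_probs(v_k, w, c)      # sigma(sum_i w_ij v_i^(k) + c_j)
    for i in range(len(b)):
        for j in range(len(c)):
            w[i][j] += eta * (ph_data[j] * v_data[i] - ph_model[j] * v_k[i])
        b[i] += eta * (v_data[i] - v_k[i])
    for j in range(len(c)):
        c[j] += eta * (ph_data[j] - ph_model[j])
    return v_k

rng = random.Random(0)
data = [[1, 0, 1, 1], [1, 0, 1, 1], [0, 1, 0, 0]]  # hypothetical toy dataset
w = [[rng.gauss(0.0, 0.01) for _ in range(3)] for _ in range(4)]
b, c = [0.0] * 4, [0.0] * 3

for epoch in range(200):
    for v in data:
        cd_update(v, w, b, c, rng, k=1)

print([round(p, 2) for p in visible_probs(bernoulli(hidden_probs(data[0], w, c), rng), w, b)])
```

The hidden-unit probabilities are used in the update terms instead of sampled binary values, which matches the expectation formulas on the slides.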

Page 73: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for RBM

Cost = Negative Log-Likelihood (NLL)

$NLL(\theta \mid v) = -\ln \sum_h e^{-E(v, h)} + \ln \sum_{v, h} e^{-E(v, h)}$

The first term is the data term in $E(v, h)$; the second is the model term in $E(v, h)$

[Figure: energy surface in global configurations, with the points "Datapoint + Hidden(datapoint)" and "Reconstruction + Hidden(reconstruction)" connected by sampling]

Page 75: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for RBM

Cost = Negative Log-Likelihood (NLL)

$NLL(\theta \mid v) = -\ln \sum_h e^{-E(v, h)} + \ln \sum_{v, h} e^{-E(v, h)}$

[Figure: energy surface in global configurations, with the datapoint, the global minimum, and the sampling direction marked]

Page 76: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for RBM

Cost = Negative Log-Likelihood (NLL)

$NLL(\theta \mid v) = -\ln \sum_h e^{-E(v, h)} + \ln \sum_{v, h} e^{-E(v, h)}$

$P(v_i) = \frac{e^{-E(v_i)}}{\sum_j e^{-E(v_j)}}$

Numerator: ith configuration. Denominator: overall configuration.

[Figure: energy surface with the datapoint and the sampling direction marked]

Page 78: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for RBM

[Figure: layers $v_1, v_2, ...$ and $h_1, h_2, ...$ at t = 0 and t = 1, with the sampling direction pointing toward the global minimum]

$P(h_j = 1 \mid v) = \sigma\left(\sum_i w_{ij} v_i + c_j\right)$

$P(v_i = 1 \mid h) = \sigma\left(\sum_j w_{ij} h_j + b_i\right)$

Boltzmann Distribution

$P(v_i, h_j) = \frac{e^{-E(v_i, h_j)}}{\sum_{k,l} e^{-E(v_k, h_l)}}$

[Figure: $P(E_i)$ plotted against the energy $E_i$]

Page 82: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for RBM

[Figure: layers $v_1, v_2, ...$ and $h_1, h_2, ...$ at t = 0, t = 1, …, t = ∞, with the sampling direction pointing toward the global minimum]

[Figure: energy surface in global configurations, with sampling moving toward the global minimum]

Page 83: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for RBM

PCD Vs. CD

Contrastive Divergence (CD)

[Figure: energy surface with the global minimum marked]

Page 86: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Intuition for RBM

PCD Vs. CD

Persistent Contrastive Divergence (PCD)

[Figure: energy surface with the global minimum and the previous sample point marked]

Winner is PCD!
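The difference between the two methods is where the model-side chain starts: CD restarts it from the data at every update, while PCD continues from the previous sample point. The following is a minimal sketch of a PCD update under that reading; the function names, toy dataset, initial weights, and learning rate are hypothetical.

```python
import math
import random

def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def hidden_probs(v, w, c):
    return [sigmoid(sum(w[i][j] * v[i] for i in range(len(v))) + c[j])
            for j in range(len(c))]

def visible_probs(h, w, b):
    return [sigmoid(sum(w[i][j] * h[j] for j in range(len(h))) + b[i])
            for i in range(len(b))]

def bernoulli(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]

def pcd_update(v_data, v_chain, w, b, c, rng, eta=0.05):
    """One PCD update: the model-side chain continues from v_chain, not from v_data."""
    h_chain = bernoulli(hidden_probs(v_chain, w, c), rng)
    v_chain = bernoulli(visible_probs(h_chain, w, b), rng)
    ph_data = hidden_probs(v_data, w, c)
    ph_chain = hidden_probs(v_chain, w, c)
    for i in range(len(b)):
        for j in range(len(c)):
            w[i][j] += eta * (ph_data[j] * v_data[i] - ph_chain[j] * v_chain[i])
        b[i] += eta * (v_data[i] - v_chain[i])
    for j in range(len(c)):
        c[j] += eta * (ph_data[j] - ph_chain[j])
    return v_chain  # carried over as the starting point of the next update

rng = random.Random(0)
data = [[1, 0, 1, 1], [0, 1, 0, 0]]  # hypothetical toy dataset
w = [[rng.gauss(0.0, 0.01) for _ in range(3)] for _ in range(4)]
b, c = [0.0] * 4, [0.0] * 3

v_chain = [rng.randint(0, 1) for _ in range(4)]  # persistent chain state
for epoch in range(100):
    for v in data:
        v_chain = pcd_update(v, v_chain, w, b, c, rng)

print(v_chain)
```

Apart from the starting point of the chain, the update terms are the same as in the CD rules above.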

Page 90: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Practice

[Figure: Input Data; 1st epoch Reconstruction]

Page 91: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Practice

[Figure: 11th epoch Reconstruction; 61st epoch Reconstruction]

Page 92: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

In Reality – Unsupervised Pretraining

[Figure: stacked network with a visible layer $v_1, v_2, v_3, ...$, two hidden layers $h_1, h_2, ...$, and an output layer $y_1, y_2, ...$]

Pretraining!

Page 93: Restricted Boltzmann Machine (RBM) presentation of fundamental theory

Thank you!