Deep Boltzmann Machines. Paper by: R. Salakhutdinov, G. Hinton. Presenter: Roozbeh Gholizadeh.
Outline
Problems with some other methods!
Energy based models
Boltzmann machine
Restricted Boltzmann machine
Deep Boltzmann machine
Problems with other methods!
Supervised learning needs labeled data.
The amount of information is restricted by the labels!
Abnormalities must sometimes be recognized before they have ever been seen, such as certain conditions in a nuclear power plant.
So instead of learning p(label | data), learn p(data).
Energy Based Models
An energy function is defined that assigns a scalar score (the energy) to each configuration.
Ex. $p(x) = \frac{e^{-E(x)}}{Z}$, the Boltzmann (Gibbs) distribution.
$Z = \int e^{-E(x)}\,dx$, the integral of the numerator over all observations.
Parameters that give lower energy to the observed data are desired.
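As a small illustration of how an energy turns into a probability, here is a minimal sketch for a handful of binary units, where the integral becomes a sum that can be enumerated. The quadratic energy, the weights `W` and the biases `b` are assumptions chosen for the example, not values from the slides.

```python
import itertools
import numpy as np

def energy(x, W, b):
    """Assumed quadratic energy E(x) = -0.5 x^T W x - b^T x for a binary vector x."""
    return -0.5 * x @ W @ x - b @ x

def boltzmann_distribution(W, b):
    """Enumerate every binary configuration and normalise exp(-E) by Z."""
    n = len(b)
    configs = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
    energies = np.array([energy(x, W, b) for x in configs])
    unnormalised = np.exp(-energies)
    Z = unnormalised.sum()  # partition function: numerator summed over all configurations
    return configs, energies, unnormalised / Z

# Illustrative symmetric weights with zero diagonal (hypothetical values).
W = np.array([[0.0, 1.0, -0.5],
              [1.0, 0.0, 0.3],
              [-0.5, 0.3, 0.0]])
b = np.array([0.1, -0.2, 0.0])

configs, energies, probs = boltzmann_distribution(W, b)
lowest = np.argmin(energies)
print("lowest-energy configuration:", configs[lowest], "probability:", probs[lowest])
```

The configuration with the lowest energy receives the highest probability, which is why learning pushes the energy of observed data down. Enumeration only works for a few units; for realistic sizes Z is intractable.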
Boltzmann machine
Markov random field (MRF) with hidden variables.
Undirected edges represent dependencies between units; each edge can be assigned a weight.
Conditional distributions over hidden and visible units:
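For reference, a sketch of these conditionals in the usual notation for a general Boltzmann machine with binary units. The symbols are assumptions introduced here: $W$ for visible-hidden weights, $L$ for visible-visible weights, $J$ for hidden-hidden weights, and $\sigma(x) = 1/(1 + e^{-x})$; bias terms are omitted.

```latex
\begin{align}
E(v, h; \theta) &= -\tfrac{1}{2} v^\top L v - \tfrac{1}{2} h^\top J h - v^\top W h \\
p(h_j = 1 \mid v, h_{-j}) &= \sigma\Big( \sum_i W_{ij} v_i + \sum_{m \neq j} J_{jm} h_m \Big) \\
p(v_i = 1 \mid h, v_{-i}) &= \sigma\Big( \sum_j W_{ij} h_j + \sum_{k \neq i} L_{ik} v_k \Big)
\end{align}
```

Each unit is switched on with a probability given by the logistic function of its total weighted input from all other units.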
Learning process
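Parameter update: $\Delta W = \alpha \left( \mathbb{E}_{P_{\text{data}}}[v h^\top] - \mathbb{E}_{P_{\text{model}}}[v h^\top] \right)$, and similarly for the other weights.
Exact maximum likelihood learning is intractable.
Use Gibbs sampling to approximate the two expectations.
Run two separate Markov chains, one for the data-dependent expectation and one for the model expectation.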
Restricted Boltzmann Machine
Setting the visible-visible and hidden-hidden weights to zero ($L = 0$, $J = 0$).
No visible-visible or hidden-hidden connections!
Learning can be carried out efficiently using Contrastive Divergence (CD)
or the stochastic approximation procedure (SAP).
Variational approach to estimating data-dependent expectations.
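To make the Contrastive Divergence idea concrete, the following is a minimal NumPy sketch of a single-step variant (CD-1) for a binary RBM. The function name `cd1_step`, the layer sizes, the learning rate and the random toy data are all assumptions for illustration; this is not the presenter's or the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr, rng):
    """One CD-1 update on a batch v0 of shape (batch, n_vis)."""
    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step down to the visibles and back up.
    pv1 = sigmoid(h0 @ W.T + b_vis)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b_hid)
    # Update: data statistics minus one-step reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b_vis = b_vis + lr * (v0 - v1).mean(axis=0)
    b_hid = b_hid + lr * (ph0 - ph1).mean(axis=0)
    return W, b_vis, b_hid

rng = np.random.default_rng(0)
n_vis, n_hid = 6, 3                                   # hypothetical sizes
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)
data = (rng.random((20, n_vis)) < 0.5).astype(float)  # toy binary data

for _ in range(10):
    W, b_vis, b_hid = cd1_step(data, W, b_vis, b_hid, lr=0.1, rng=rng)
```

Because there are no connections within a layer, all hidden units can be sampled in parallel given the visibles and vice versa, which is what makes each step a pair of matrix products.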
Stochastic approximation procedure (SAP)
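$\theta_t$ and $X^t$: current parameters and state.
$\theta_t$ and $X^t$ are updated sequentially as follows:
Given $X^t$, a new state $X^{t+1}$ is sampled from a transition operator $T_{\theta_t}(X^{t+1}; X^t)$ that leaves $p_{\theta_t}$ invariant.
The new parameter $\theta_{t+1}$ is obtained by replacing the intractable model expectation with the expectation with respect to $X^{t+1}$.
The learning rate has to decrease with time, for example $\alpha_t = 1/t$.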
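The alternating update of state and parameters can be sketched as follows for a bias-free binary RBM with persistent Markov chains. This is a minimal sketch under stated assumptions: the function names `gibbs_transition` and `sap_train`, the number of chains, the base step size `alpha0` and the toy data are hypothetical, and the Gibbs sweep stands in for the transition operator.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_transition(v, W, rng):
    """One Gibbs sweep v -> h -> v; it leaves the current model distribution invariant."""
    ph = sigmoid(v @ W)
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = sigmoid(h @ W.T)
    return (rng.random(pv.shape) < pv).astype(float)

def sap_train(data, n_hid, n_chains, n_steps, alpha0, rng):
    n_vis = data.shape[1]
    W = 0.01 * rng.standard_normal((n_vis, n_hid))                # initial parameters
    chains = (rng.random((n_chains, n_vis)) < 0.5).astype(float)  # initial state
    for t in range(1, n_steps + 1):
        alpha = alpha0 / t  # learning rate decreasing as 1/t (alpha0 is an assumed scale)
        # State update: sample the next state with the current parameters.
        chains = gibbs_transition(chains, W, rng)
        # Parameter update: the model expectation is replaced by the chain average.
        data_term = data.T @ sigmoid(data @ W) / data.shape[0]
        model_term = chains.T @ sigmoid(chains @ W) / n_chains
        W = W + alpha * (data_term - model_term)
    return W, chains

rng = np.random.default_rng(1)
data = (rng.random((50, 8)) < 0.3).astype(float)  # toy binary data
W, chains = sap_train(data, n_hid=4, n_chains=10, n_steps=100, alpha0=0.1, rng=rng)
```

Unlike CD, the chains are never reset to the data: they persist across parameter updates, so the state and the parameters evolve together.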
Why go deep?
Deep architectures are representationally efficient: fewer computational units are needed for the same function.
They allow a hierarchy of features to be represented.
Non-local generalization.
Easier to monitor what is being learned and to guide the machine.
Deep Boltzmann Machine
All connections between adjacent layers are undirected.
Conditional distributions over visible and hidden units:
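As a sketch, for a Deep Boltzmann machine with two hidden layers and no within-layer connections, the conditionals take the following form. The symbols are assumptions introduced here: $W^1$ connects $v$ to $h^1$, $W^2$ connects $h^1$ to $h^2$, $\sigma$ is the logistic function, and biases are omitted.

```latex
\begin{align}
E(v, h^1, h^2; \theta) &= -v^\top W^1 h^1 - {h^1}^\top W^2 h^2 \\
p(h^1_j = 1 \mid v, h^2) &= \sigma\Big( \sum_i W^1_{ij} v_i + \sum_m W^2_{jm} h^2_m \Big) \\
p(h^2_m = 1 \mid h^1) &= \sigma\Big( \sum_j W^2_{jm} h^1_j \Big) \\
p(v_i = 1 \mid h^1) &= \sigma\Big( \sum_j W^1_{ij} h^1_j \Big)
\end{align}
```

The middle layer $h^1$ receives input from both the layer below and the layer above, which is the key difference from a stack of independently used RBMs.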
Pretraining (greedy layerwise)
MNIST dataset
NORB
Misclassification error rate: DBM: 10.8%, SVM: 11.6%, logistic regression: 22.5%, K-nearest neighbors: 18.4%
Thank you!