Comp 5013 Deep Learning Architectures
Daniel L. Silver, March 2014
Y. Bengio – Université de Montréal
• 2009 Deep Learning Tutorial
• 2013 Deep Learning towards AI
• Deep Learning of Representations (Y. Bengio)– http://www.youtube.com/watch?v=4xsVFLnHC_0
Deep Belief RBM Networks with Geoff Hinton
• Learning layers of features by stacking RBMs– http://www.youtube.com/watch?v=VRuQf3DjmfM
• Discriminative fine-tuning in DBN– http://www.youtube.com/watch?v=-I2pgcH02QM
• What happens during fine-tuning?– http://www.youtube.com/watch?v=yxMeeySrfDs
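The layer-by-layer RBM training Hinton walks through in these clips rests on one update rule, contrastive divergence (CD-1). Below is a minimal numpy sketch of that rule, not Hinton's actual code — the layer sizes, learning rate, and toy data are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM."""
    # Positive phase: hidden probabilities and a binary sample, given the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step back to the visibles, then to the hiddens.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # Update: data-driven correlations minus reconstruction-driven correlations.
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis += lr * (v0 - p_v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)

# Toy setup: 6 visible units, 4 hidden feature detectors, random binary data.
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)
data = (rng.random((20, n_vis)) < 0.5).astype(float)
for _ in range(100):
    cd1_step(data, W, b_vis, b_hid)
```

Stacking then means: train one RBM this way, compute its hidden probabilities for every training case, and use those as the "data" for the next RBM up.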
Deep Belief RBM Networks with Geoff Hinton
• Learning handwritten digits– http://www.cs.toronto.edu/~hinton/digits.html
• Modeling real-value data (G.Hinton)– http://www.youtube.com/watch?v=jzMahqXfM7I
Deep Learning Architectures
• Consider the problem of trying to classify these hand-written digits.
Deep Learning Architectures
[Figure: Hinton's digit network – images of digits 0-9 (28 x 28 pixels) feed 500 neurons of low-level features, then 500 neurons of higher-level features, then 2000 top-level artificial neurons connected to the 10 label units]
• Neural network:
– Trained on 40,000 examples
– Learns both to label/recognize images and to generate images from labels
– Probabilistic in nature
– Demo
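The generative direction mentioned on this slide (producing images from labels) works by clamping a label unit, running Gibbs sampling in the top-level RBM, and then doing one deterministic top-down pass to pixel probabilities. The sketch below uses the slide's layer sizes but random, untrained weights, so it only shows the shape of the computation — a real image would require trained weights:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Layer sizes from the slide: 28x28 image -> 500 -> 500, with 2000 top units.
sizes = [784, 500, 500]
W = [0.01 * rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]
# The top RBM models the penultimate 500 units jointly with a 10-way label.
W_top = 0.01 * rng.standard_normal((500 + 10, 2000))

def generate(label, gibbs_steps=20):
    """Clamp a label, run Gibbs sampling at the top, decode down to pixels."""
    lab = np.zeros(10)
    lab[label] = 1.0
    pen = (rng.random(500) < 0.5).astype(float)  # penultimate layer state
    for _ in range(gibbs_steps):
        v = np.concatenate([pen, lab])           # label stays clamped
        h = (rng.random(2000) < sigmoid(v @ W_top)).astype(float)
        v = sigmoid(W_top @ h)
        pen = (rng.random(500) < v[:500]).astype(float)
    # Deterministic top-down pass through the stack to pixel probabilities.
    x = pen
    for Wl in reversed(W):
        x = sigmoid(Wl @ x)
    return x.reshape(28, 28)

img = generate(label=3)
```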
Deep Convolution Networks
• Intro - http://www.deeplearning.net/tutorial/lenet.html#lenet
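The LeNet tutorial linked above is built from two basic operations: a "valid" 2-D convolution over local image patches and a pooling (subsampling) step. A bare-bones numpy version of both, with an arbitrary edge-detecting filter as the example kernel:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation: the core op of a convolutional layer."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling (LeNet uses a similar subsampling step)."""
    H, W = x.shape
    trimmed = x[:H - H % size, :W - W % size]
    return trimmed.reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0]])       # simple horizontal-difference filter
fmap = conv2d_valid(image, edge)     # feature map, shape (6, 5)
pooled = max_pool2d(fmap)            # pooled map, shape (3, 2)
```

A full convolutional layer just applies many such kernels in parallel and passes the feature maps through a nonlinearity before pooling.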
ML and Computing Power
Andrew Ng’s work on Deep Learning Networks (ICML-2012)
• Problem: learn to recognize human faces, cats, etc. from unlabeled data
• Dataset of 10 million images; each image has 200 x 200 pixels
• 9-layer locally connected neural network (1B connections)
• Parallel algorithm; 1,000 machines (16,000 cores) for three days
Quoc V. Le, Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng. Building High-level Features Using Large Scale Unsupervised Learning. ICML 2012: 29th International Conference on Machine Learning, Edinburgh, Scotland, June 2012.
ML and Computing Power
Results:
• A face detector that is 81.7% accurate
• Robust to translation, scaling, and rotation
Further results:
• 15.8% accuracy in recognizing 20,000 object categories from ImageNet
• 70% relative improvement over the previous state of the art
Deep Belief Convolution Networks
• Deep Belief Convolution Network (JavaScript)
– Runs well under Google Chrome
– https://www.jetpac.com/deepbelief
Google and DLA
• http://www.youtube.com/watch?v=JBtfRiGEAFI
• http://www.technologyreview.com/news/524026/is-google-cornering-the-market-on-deep-learning/
Cloud-Based ML - Google
https://developers.google.com/prediction/
Additional References
• http://deeplearning.net
• http://en.wikipedia.org/wiki/Deep_learning
• Coursera course – Neural Networks for Machine Learning:
– https://class.coursera.org/neuralnets-2012-001/lecture
• ML: Hottest Tech Trend in next 3-5 Years– http://www.youtube.com/watch?v=b4zr9Zx5WiE
• Geoff Hinton’s homepage– https://www.cs.toronto.edu/~hinton/
Open Questions in ML
Challenges & Open Questions
• Stability-Plasticity problem - How do we integrate new knowledge in with old?
– No loss of new knowledge
– No loss of prior knowledge
– Efficient methods of storage and recall
• ML methods that can retain learned knowledge will be approaches to “common knowledge” representation – a “Big AI” problem
Challenges & Open Questions
• Practice makes perfect!
– An LML system must be capable of learning from examples of tasks over a lifetime
– Practice should increase model accuracy and overall domain knowledge
– How can this be done?
– Research important to AI, psychology, and education
Challenges & Open Questions
• Scalability
– Often a difficult but important challenge
– Must scale with increasing:
• Number of inputs and outputs
• Number of training examples
• Number of tasks
• Complexity of tasks, size of hypothesis representation
– Preferably, linear growth
Never-Ending Language Learner
• Carlson et al. (2010)
• Each day: extracts information from the web to populate a growing knowledge base of language semantics
• Learns to perform this task better than on the previous day
– Uses an MTL approach in which a large number of different semantic functions are trained together
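The MTL idea behind NELL — many related functions learned jointly over one shared internal representation — can be sketched as a shared hidden layer with one output head per task, where the shared weights receive gradient from all tasks at once. This is a toy illustration of that weight-sharing scheme, not NELL's actual architecture; the sizes, learning rate, and synthetic tasks are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_in, n_hid, n_tasks = 8, 5, 3
W_shared = 0.1 * rng.standard_normal((n_in, n_hid))    # common representation
W_heads = 0.1 * rng.standard_normal((n_tasks, n_hid))  # one output per task

X = rng.standard_normal((40, n_in))
Y = (X[:, :n_tasks] > 0).astype(float)  # synthetic task t: is feature t positive?

lr = 0.5
for _ in range(500):
    H = sigmoid(X @ W_shared)           # shared features, used by every task
    P = sigmoid(H @ W_heads.T)          # one prediction per task
    G = P - Y                           # cross-entropy gradient at the outputs
    dH = (G @ W_heads) * H * (1 - H)    # hidden gradient, pooled over all tasks
    W_heads -= lr * (G.T @ H) / len(X)
    W_shared -= lr * (X.T @ dH) / len(X)

P = sigmoid(sigmoid(X @ W_shared) @ W_heads.T)
loss = -np.mean(Y * np.log(P) + (1 - Y) * np.log(1 - P))
```

The key point is the `dH` line: every task's error pushes on the same `W_shared`, which is how training related functions together can improve the shared representation.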