Comp 5013 Deep Learning Architectures
Daniel L. Silver, March 2014
Y. Bengio – Université de Montréal
• 2009 Deep Learning Tutorial
• 2013 Deep Learning towards AI
• Deep Learning of Representations (Y. Bengio)– http://www.youtube.com/watch?v=4xsVFLnHC_0
Deep Belief RBM Networks with Geoff Hinton
• Learning layers of features by stacking RBMs– http://www.youtube.com/watch?v=VRuQf3DjmfM
• Discriminative fine-tuning in DBN– http://www.youtube.com/watch?v=-I2pgcH02QM
• What happens during fine-tuning?– http://www.youtube.com/watch?v=yxMeeySrfDs
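The layer-by-layer RBM training Hinton walks through in these clips rests on one update rule, contrastive divergence (CD-1). Below is a minimal numpy sketch of that rule, not Hinton's actual code — the layer sizes, learning rate, and toy data are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM."""
    # Positive phase: hidden probabilities and a binary sample, given the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step back to the visibles, then to the hiddens.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # Update: data-driven correlations minus reconstruction-driven correlations.
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis += lr * (v0 - p_v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)

# Toy setup: 6 visible units, 4 hidden feature detectors, random binary data.
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)
data = (rng.random((20, n_vis)) < 0.5).astype(float)
for _ in range(100):
    cd1_step(data, W, b_vis, b_hid)
```

Stacking then means: train one RBM this way, compute its hidden probabilities for every training case, and use those as the "data" for the next RBM up.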
Deep Belief RBM Networks with Geoff Hinton
• Learning handwritten digits– http://www.cs.toronto.edu/~hinton/digits.html
• Modeling real-value data (G.Hinton)– http://www.youtube.com/watch?v=jzMahqXfM7I
Deep Learning Architectures
• Consider the problem of trying to classify these hand-written digits.
Deep Learning Architectures
[Figure: Hinton's digit network – images of digits 0-9 (28 x 28 pixels) feed 500 neurons of low-level features, then 500 neurons of higher-level features, then 2000 top-level artificial neurons connected to the 10 label units]
• Neural network:
– Trained on 40,000 examples
– Learns both to label/recognize images and to generate images from labels
– Probabilistic in nature
– Demo
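The generative direction mentioned on this slide (producing images from labels) works by clamping a label unit, running Gibbs sampling in the top-level RBM, and then doing one deterministic top-down pass to pixel probabilities. The sketch below uses the slide's layer sizes but random, untrained weights, so it only shows the shape of the computation — a real image would require trained weights:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Layer sizes from the slide: 28x28 image -> 500 -> 500, with 2000 top units.
sizes = [784, 500, 500]
W = [0.01 * rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]
# The top RBM models the penultimate 500 units jointly with a 10-way label.
W_top = 0.01 * rng.standard_normal((500 + 10, 2000))

def generate(label, gibbs_steps=20):
    """Clamp a label, run Gibbs sampling at the top, decode down to pixels."""
    lab = np.zeros(10)
    lab[label] = 1.0
    pen = (rng.random(500) < 0.5).astype(float)  # penultimate layer state
    for _ in range(gibbs_steps):
        v = np.concatenate([pen, lab])           # label stays clamped
        h = (rng.random(2000) < sigmoid(v @ W_top)).astype(float)
        v = sigmoid(W_top @ h)
        pen = (rng.random(500) < v[:500]).astype(float)
    # Deterministic top-down pass through the stack to pixel probabilities.
    x = pen
    for Wl in reversed(W):
        x = sigmoid(Wl @ x)
    return x.reshape(28, 28)

img = generate(label=3)
```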
Deep Convolution Networks
• Intro - http://www.deeplearning.net/tutorial/lenet.html#lenet
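The LeNet tutorial linked above is built from two basic operations: a "valid" 2-D convolution over local image patches and a pooling (subsampling) step. A bare-bones numpy version of both, with an arbitrary edge-detecting filter as the example kernel:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation: the core op of a convolutional layer."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling (LeNet uses a similar subsampling step)."""
    H, W = x.shape
    trimmed = x[:H - H % size, :W - W % size]
    return trimmed.reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0]])       # simple horizontal-difference filter
fmap = conv2d_valid(image, edge)     # feature map, shape (6, 5)
pooled = max_pool2d(fmap)            # pooled map, shape (3, 2)
```

A full convolutional layer just applies many such kernels in parallel and passes the feature maps through a nonlinearity before pooling.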
ML and Computing Power
Andrew Ng’s work on Deep Learning Networks (ICML-2012)
• Problem: learn to recognize human faces, cats, etc. from unlabeled data
• Dataset of 10 million images; each image has 200 x 200 pixels
• 9-layer locally connected neural network (1B connections)
• Parallel algorithm; 1,000 machines (16,000 cores) for three days
Quoc V. Le, Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng. Building High-level Features Using Large Scale Unsupervised Learning. ICML 2012: 29th International Conference on Machine Learning, Edinburgh, Scotland, June 2012.
ML and Computing Power
Results:
• A face detector that is 81.7% accurate
• Robust to translation, scaling, and rotation
Further results:
• 15.8% accuracy in recognizing 20,000 object categories from ImageNet
• 70% relative improvement over the previous state of the art
Deep Belief Convolution Networks
• Deep Belief Convolution Network (JavaScript)
– Runs well under Google Chrome
– https://www.jetpac.com/deepbelief
Google and DLA
• http://www.youtube.com/watch?v=JBtfRiGEAFI
• http://www.technologyreview.com/news/524026/is-google-cornering-the-market-on-deep-learning/
Cloud-Based ML - Google
https://developers.google.com/prediction/
Additional References
• http://deeplearning.net
• http://en.wikipedia.org/wiki/Deep_learning
• Coursera course – Neural Networks for Machine Learning:
– https://class.coursera.org/neuralnets-2012-001/lecture
• ML: Hottest Tech Trend in next 3-5 Years– http://www.youtube.com/watch?v=b4zr9Zx5WiE
• Geoff Hinton’s homepage– https://www.cs.toronto.edu/~hinton/
Open Questions in ML
Challenges & Open Questions
• Stability-Plasticity problem - How do we integrate new knowledge in with old?
– No loss of new knowledge
– No loss of prior knowledge
– Efficient methods of storage and recall
• ML methods that can retain learned knowledge will be approaches to “common knowledge” representation – a “Big AI” problem
Challenges & Open Questions
• Practice makes perfect!
– An LML system must be capable of learning from examples of tasks over a lifetime
– Practice should increase model accuracy and overall domain knowledge
– How can this be done?
– Research important to AI, psychology, and education
Challenges & Open Questions
• Scalability
– Often a difficult but important challenge
– Must scale with increasing:
• Number of inputs and outputs
• Number of training examples
• Number of tasks
• Complexity of tasks, size of hypothesis representation
– Preferably, linear growth
Never-Ending Language Learner
• Carlson et al. (2010)
• Each day: extracts information from the web to populate a growing knowledge base of language semantics
• Learns to perform this task better than on the previous day
– Uses an MTL approach in which a large number of different semantic functions are trained together
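The MTL idea behind NELL — many related functions learned jointly over one shared internal representation — can be sketched as a shared hidden layer with one output head per task, where the shared weights receive gradient from all tasks at once. This is a toy illustration of that weight-sharing scheme, not NELL's actual architecture; the sizes, learning rate, and synthetic tasks are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_in, n_hid, n_tasks = 8, 5, 3
W_shared = 0.1 * rng.standard_normal((n_in, n_hid))    # common representation
W_heads = 0.1 * rng.standard_normal((n_tasks, n_hid))  # one output per task

X = rng.standard_normal((40, n_in))
Y = (X[:, :n_tasks] > 0).astype(float)  # synthetic task t: is feature t positive?

lr = 0.5
for _ in range(500):
    H = sigmoid(X @ W_shared)           # shared features, used by every task
    P = sigmoid(H @ W_heads.T)          # one prediction per task
    G = P - Y                           # cross-entropy gradient at the outputs
    dH = (G @ W_heads) * H * (1 - H)    # hidden gradient, pooled over all tasks
    W_heads -= lr * (G.T @ H) / len(X)
    W_shared -= lr * (X.T @ dH) / len(X)

P = sigmoid(sigmoid(X @ W_shared) @ W_heads.T)
loss = -np.mean(Y * np.log(P) + (1 - Y) * np.log(1 - P))
```

The key point is the `dH` line: every task's error pushes on the same `W_shared`, which is how training related functions together can improve the shared representation.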