Lifelong Machine Learning and Reasoning
Daniel L. Silver
Acadia University,
Wolfville, NS, Canada
Talk Outline
- Position and Motivation
- Lifelong Machine Learning
- Deep Learning Architectures
- Neural-Symbolic Integration
- Learning to Reason
- Summary and Recommendations
Team Silver
Position
It is now appropriate to seriously consider the nature of systems that learn and reason over a lifetime.

We advocate a systems approach in the context of an agent that can:
- Acquire new knowledge through learning
- Retain and consolidate that knowledge
- Use it in future learning, reasoning, and other aspects of AI
Moving Beyond Learning Algorithms - Rationale
1. Strong foundation in prior work
2. Inductive bias is essential to learning (Mitchell, Utgoff 1983; Wolpert 1996)
   - Learning systems should retain and use prior knowledge as a source for shifting inductive bias
   - Many real-world problems are non-stationary and exhibit drift
Moving Beyond Learning Algorithms - Rationale
3. Practical agents and robots require LML
   - Advances in autonomous robotics and in intelligent agents that run on the web or on mobile devices present opportunities for employing LML systems.
   - The ability to retain and use learned knowledge is very attractive to the researchers designing these systems.
Moving Beyond Learning Algorithms - Rationale
4. Increasing capacity of computers
   - Advances in modern computers provide the computational power for implementing and testing practical LML systems.
   - IBM's Watson (2011): 90 IBM Power-7 servers, each with four 8-core processors, and 15 TB (220M text pages) of RAM; tasks were divided into thousands of stand-alone jobs distributed across roughly 80 teraflops of compute (a teraflop is one trillion operations per second).
Moving Beyond Learning Algorithms - Rationale
5. Theoretical advances in AI: ML and KR
   - "The acquisition, representation and transfer of domain knowledge are the key scientific concerns that arise in lifelong learning." (Thrun 1997)
   - KR plays an important role in LML: the interaction between knowledge retention and transfer
   - LML has the potential to make advances on the learning of common background knowledge
   - This leads to questions about learning to reason
Lifelong Machine Learning
[Photos, 1994 and 2013: my first biological learning system]
Lifelong Machine Learning
- Considers systems that can learn many tasks over a lifetime, from one or more domains
- Concerned with methods of retaining and using learned knowledge to improve the effectiveness and efficiency of future learning
- We investigate systems that must learn:
  - from impoverished training sets
  - for diverse domains of tasks
  - where practice of the same task happens
- Applications: agents, robotics, data mining, user modeling
Lifelong Machine Learning Framework
[Figure: the LML framework. An inductive learning system (short-term memory) draws training and testing examples (x_i, y = f(x_i)) from an instance space X and induces a model/classifier h, used for prediction/action h(x). Learned models are retained in a long-term store of universal/domain knowledge; knowledge selection and knowledge transfer feed an inductive bias B_D back to the short-term learner.]
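To make the flow concrete, here is a minimal Python sketch of the framework loop; the class and its helpers (select_bias, fit, and the consolidation step) are hypothetical stand-ins for exposition, not the lab's implementation.

```python
# Minimal sketch of the LML framework loop; the learner and its
# helpers are hypothetical stand-ins, not an actual LML system.

class LifelongLearner:
    def __init__(self):
        self.domain_knowledge = {}              # long-term retention

    def select_bias(self, task_id):
        """Knowledge selection: pick prior knowledge relevant to the task."""
        return self.domain_knowledge.get(task_id)   # naive exact-task lookup

    def fit(self, examples, bias):
        """Stand-in inductive learner: majority label, seeded by the bias."""
        labels = [y for _, y in examples] + ([bias] if bias is not None else [])
        majority = max(set(labels), key=labels.count)
        return lambda x: majority               # model h: prediction = h(x)

    def learn(self, task_id, examples):
        """Short-term learning under transferred bias, then retention."""
        h = self.fit(examples, self.select_bias(task_id))
        self.domain_knowledge[task_id] = h(None)    # consolidate into long-term store
        return h

learner = LifelongLearner()
h = learner.learn("task-1", [((0, 1), 1), ((1, 0), 1), ((1, 1), 0)])
print(h((0, 0)))    # -> 1 (majority label retained as domain knowledge)
```

The loop mirrors the diagram: short-term learning consults long-term domain knowledge for its inductive bias and consolidates its result back.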
Essential Ingredients of LML
The retention (or consolidation) of learned task knowledge (the Knowledge Representation perspective). Effective and efficient retention:
- Resists the accumulation of erroneous knowledge
- Maintains or improves model performance
- Mitigates redundant representation
- Allows the practice of tasks
Essential Ingredients of LML
The selective transfer of prior knowledge when learning new tasks (the Machine Learning perspective). More effective and efficient learning:
- More rapidly produces models that perform better
- Selects an appropriate inductive bias to guide search
Essential Ingredients of LML
A systems approach:
- Ensures the effective and efficient interaction of the retention and transfer components
- Much can be learned from the writings of early cognitive scientists, AI researchers, and neuroscientists such as Albus, Holland, Newell, Langley, Johnson-Laird, and Minsky
Overview of LML Work
- Supervised learning
- Unsupervised learning
- Hybrids (semi-supervised, self-taught, co-training, etc.)
- Reinforcement learning (Mark Ring, Rich Sutton, Tanaka and Yamamura)
Supervised LML
- Michalski (1980s): constructive inductive learning
- Utgoff and Mitchell (1983): the importance of inductive bias to learning; systems should be able to search for an appropriate inductive bias using prior knowledge
- Solomonoff (1989): incremental learning
- Thrun and Mitchell (1990s): explanation-based neural networks (EBNN); lifelong learning
LML via csMTL (context-sensitive multiple task learning)

[Figure: the csMTL architecture. Task-context inputs c1..ck and standard inputs x1..xn feed a short-term learning network with a single output f'(c, x) for all tasks. Representational transfer from the long-term consolidated domain knowledge (CDK) network f1(c, x) enables rapid learning; task rehearsal (functional transfer via virtual examples) supports slow consolidation into the CDK network.]

Silver, Poirier, Currie (also Tu, Fowler). Inductive transfer with context-sensitive neural networks. Machine Learning (2008) 73:313–336.

A toy sketch of the input encoding follows.
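Here is a toy NumPy sketch of that encoding, offered as an illustration rather than the networks used in the 2008 paper: a one-hot task context c is concatenated with the standard inputs x, a hidden layer is shared across tasks, and one sigmoid output f(c, x) serves every task.

```python
import numpy as np

# Toy csMTL sketch (illustrative, not the 2008 paper's code): the
# one-hot task context c is concatenated with the standard inputs x,
# and a single sigmoid output f(c, x) serves every task.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_tasks, n_inputs, n_hidden = 3, 5, 8
W1 = rng.normal(0, 0.1, (n_tasks + n_inputs, n_hidden))
W2 = rng.normal(0, 0.1, (n_hidden, 1))

def forward(c, x):
    h = np.tanh(np.concatenate([c, x]) @ W1)    # shared hidden layer
    return sigmoid(h @ W2)[0], h

def sgd_step(c, x, y, lr=0.1):
    """One stochastic gradient descent step on squared error."""
    global W1, W2
    p, h = forward(c, x)
    err = p - y
    dW2 = np.outer(h, err * p * (1 - p))
    dh = (err * p * (1 - p)) * W2[:, 0] * (1 - h ** 2)
    dW1 = np.outer(np.concatenate([c, x]), dh)
    W1 -= lr * dW1
    W2 -= lr * dW2

# Training example for task 0: the context one-hot selects the task.
c = np.eye(n_tasks)[0]
x = rng.normal(size=n_inputs)
sgd_step(c, x, 1.0)
print(forward(c, x)[0])   # prediction for task 0 after one update
```

Because tasks are distinguished by context inputs rather than separate output nodes, examples from every task train one shared representation, which is what permits transfer between them.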
An Environmental Example
Stream flow rate prediction [Lisa Gaudette, 2006]
x = weather data
f(x) = flow rate
[Plot: mean absolute error, MAE (m^3/s), versus years of data transferred (0 to 6), comparing no transfer with transfer from the Wilmot, Sharpe, Sharpe & Wilmot, and Shubenacadie rivers.]
csMTL and Tasks with Multiple Outputs
Liangliang Tu (2010), image morphing:
- Inductive transfer between tasks that have multiple outputs
- Transforms 30 x 30 grey-scale images using inductive transfer
- Three mapping tasks (labelled NA, NH, NS in the figure)
csMTL and Tasks with Multiple Outputs: Demo
Two more Morphed Images
[Images: a passport photo morphed to angry and to sad, each with a filtered version]
Unsupervised LML
Deep Learning Architectures
Consider the problem of trying to classify these hand-written digits.
Hinton, G. E., Osindero, S. and Teh, Y. (2006). A fast learning algorithm for deep belief nets. Neural Computation 18, pp. 1527-1554.
Layered networks of unsupervised auto-encoders efficiently develop hierarchies of features that capture regularities in their respective inputs
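To illustrate the layer-wise idea, here is a toy tied-weight autoencoder stack in Python; it is a sketch of greedy unsupervised feature learning, not Hinton et al.'s RBM-based algorithm, and the layer sizes are simply borrowed from the digit example on the next slide.

```python
import numpy as np

# Toy greedy layer-wise autoencoder stack (illustrative sketch, not
# Hinton et al.'s RBM training): each layer learns to reconstruct the
# previous layer's codes, building a hierarchy of features.

rng = np.random.default_rng(0)

def train_autoencoder(data, n_hidden, lr=0.1, epochs=20):
    """Tied-weight autoencoder trained by gradient descent on MSE."""
    n_vis = data.shape[1]
    W = rng.normal(0, 0.1, (n_vis, n_hidden))
    for _ in range(epochs):
        h = np.tanh(data @ W)            # encode
        recon = h @ W.T                  # decode with tied weights
        err = recon - data
        # Gradient of reconstruction error through both weight uses
        dW = data.T @ ((err @ W) * (1 - h ** 2)) + err.T @ h
        W -= lr * dW / len(data)
    return W

def build_stack(data, layer_sizes):
    """Greedy layer-wise training: feed each layer's codes to the next."""
    weights, codes = [], data
    for n_hidden in layer_sizes:
        W = train_autoencoder(codes, n_hidden)
        weights.append(W)
        codes = np.tanh(codes @ W)       # codes become next layer's input
    return weights

X = rng.normal(size=(100, 28 * 28))      # stand-in for digit images
stack = build_stack(X, [500, 500, 2000]) # sizes from the DBN slide
```

Hinton et al. instead train restricted Boltzmann machines with contrastive divergence; the greedy layer-by-layer structure is the shared idea.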
Deep Learning Architectures
[Figure: a deep belief network for digit recognition. Inputs are images of the digits 0-9 (28 x 28 pixels); two hidden layers of 500 neurons capture low-level and then higher-level features; 2000 top-level artificial neurons connect to the ten digit labels.]

DLA neural network:
- Unsupervised training, followed by back-fitting
- 40,000 training examples
- Learns to recognize digits using labels, and to reconstruct digits given a label
- Stochastic in nature
Deep Learning Architectures
[Image courtesy of http://youqianhaozhe.com/research.htm]
Develop common features from unlabelled examples using unsupervised algorithms
Deep Learning Architectures
Andrew Ng’s work on deep learning networks (ICML 2012):
- Problem: learn to recognize human faces, cats, etc. from unlabeled data
- Dataset of 10 million images; each image has 200x200 pixels
- 9-layer locally connected neural network (1B connections)
- Parallel algorithm; 1,000 machines (16,000 cores) for three days
Quoc V. Le, Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng. Building high-level features using large scale unsupervised learning. ICML 2012: 29th International Conference on Machine Learning, Edinburgh, Scotland, June 2012.
Deep Learning Architectures
Results:
- A face detector that is 81.7% accurate
- Robust to translation, scaling, and rotation

Further results:
- 15.8% accuracy in recognizing 20,000 object categories from ImageNet
- A 70% relative improvement over the previous state of the art
Deep Learning Architectures
- Stimulates new ideas about how knowledge of the world is learned, consolidated, and then used for future learning and reasoning
- Learning and representation of common background knowledge
- Important to Big AI problem solving
LMLR: Learning to Reason
ML and KR … a very interesting area
- Knowledge consolidation provides insights into how best to represent common knowledge for use in learning and reasoning
- A survey of learning and reasoning paradigms has identified two additional promising bodies of work:
  - NSI: Neural-Symbolic Integration
  - L2R: Learning to Reason
Neural-Symbolic Integration
Considers hybrid systems that integrate neural networks and symbolic logic. Takes advantage of:
- the learning capacity of connectionist networks
- the transparency and reasoning capacity of logic

[Garcez09, Lamb08]

A minimal rule-encoding sketch follows.
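As one concrete flavour of such integration, a propositional rule can be compiled into a neural unit and later refined by learning. The sketch below assumes a KBANN-style encoding, which is one classic NSI technique and not necessarily the approach of the frameworks cited above.

```python
import math

# KBANN-style rule-to-network encoding (illustrative assumption, not
# the cited frameworks): the rule "C <- A AND B" becomes a sigmoid
# unit whose weights implement the conjunction, so the encoded logic
# can later be refined by gradient-based learning.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

W = 5.0                      # strength of each antecedent link

def rule_unit(a, b):
    """Neuron encoding A AND B: fires only when both antecedents hold."""
    bias = -(2 - 0.5) * W    # threshold between 1 and 2 true antecedents
    return sigmoid(W * a + W * b + bias)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(rule_unit(a, b)))   # ~1 only for a = b = 1
```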
Neural-Symbolic Integration

[Figure: an integrated framework for NSI and LML, adapted from Bader and Hitzler (2005)]
Open Questions
Choice of machine learning to use:
- Which choice of ML works best in the context of knowledge for reasoning?
- Unsupervised learning is taking a more central role
- Others feel that reinforcement learning is the only true predictive modeling
- Hybrid methods are a challenge for knowledge consolidation
Open Questions
Training examples versus prior knowledge:
- Both NSI and LML systems must weigh the accuracy and relevance of retained knowledge
- Theories of how to selectively transfer common knowledge are needed
- Measures of relatedness are needed
[Photo: a small Nova Scotia trout!]
Open Questions
Effective and efficient knowledge retention:
- Refinement and consolidation are key to NSI and LML
- Stability-plasticity: no loss of prior knowledge; increase accuracy/resolution if possible
- The approach should allow NSI/LML to efficiently select knowledge for use
- Has the potential to make serious advances on the learning of common background knowledge
Open Questions
Effective and efficient knowledge transfer:
- Transfer learning should quickly develop accurate models
- Model accuracy should never degrade
- Functional transfer yields more accurate models (e.g., rehearsal of examples from prior tasks)
- Representational transfer yields more rapid learning (e.g., priming with the weights of prior models)

Both transfer styles are contrasted in the sketch below.
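Here is a toy, one-weight contrast of the two styles; the train helper and the numbers are illustrative assumptions, not any particular LML system.

```python
import random

# Contrast of the two transfer styles on a toy 1-D learner
# (illustrative sketch, not a specific LML system).

def train(examples, init_w=0.0, lr=0.01, epochs=100):
    """Fit y = w * x by gradient descent, starting from init_w."""
    w = init_w
    for _ in range(epochs):
        for x, y in examples:
            w -= lr * (w * x - y) * x
    return w

prior_w = 2.0                                        # weight learned on a prior task
new_data = [(x, 2.1 * x) for x in (1.0, 2.0, 3.0)]   # impoverished related new task

# Functional transfer: rehearse virtual examples generated by the
# prior model alongside the new training set.
virtual = [(x, prior_w * x) for x in (random.uniform(-1, 1) for _ in range(20))]
w_func = train(new_data + virtual)

# Representational transfer: prime the search with the prior weights.
w_repr = train(new_data, init_w=prior_w)

print(w_func, w_repr)
```

Rehearsal lets knowledge of the prior task shape what the new model fits (functional transfer), while priming only shapes where the search starts (representational transfer).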
Open Questions
Practice makes perfect!
- An LML system must be capable of learning from examples of tasks over a lifetime
- Practice should increase model accuracy and overall domain knowledge
- How can this be done?
- This research is important to AI, psychology, and education
Open Questions
Scalability:
- For NSI, symbolic extraction is demanding
- For LML, retention and transfer add complexity
- Both must scale to large numbers of inputs and outputs, training examples, and tasks over a lifetime
- Big Data means big scaling problems
Learning to Reason (L2R)
- Takes a probabilistic perspective on learning and reasoning [Khardon and Roth 97]
- An agent need not answer all possible knowledge queries, only those that are relevant to the environment of the learner in a probably approximately correct (PAC) sense, i.e., with respect to some probability distribution [Valiant 08; Juba 12 & 13]
- Assertions can be learned to a desired level of accuracy and confidence using training examples of the assertions
Learning to Reason (L2R)
We are working on an LMLR approach that:
- Uses multiple task learning, primed by unsupervised deep learning
- PAC-learns multiple logical assertions expressed as binary examples of Boolean functions
- Reasons by querying the trained network with similar Boolean examples and looking for sufficient agreement on true/false

It uses a combination of:
- A DLA to create hierarchies of abstract DNF-like features
- Consolidation to integrate new assertions with prior knowledge and to share abstract features across a domain knowledge model
Learning to Reason (L2R)

Example: to learn the assertions (A ∧ B) ∨ C = True and (A ∨ C) ∧ D = True, the L2R system would be provided with examples of the Boolean functions equivalent to each assertion, subject to a distribution D over the examples (* marks a variable the assertion does not constrain):

(A ∧ B) ∨ C          (A ∨ C) ∧ D
a b c d  T           a b c d  T
0 0 0 *  0           0 * 0 0  0
0 0 1 *  1           0 * 0 1  0
0 1 0 *  0           0 * 1 0  0
0 1 1 *  1           0 * 1 1  1
1 0 0 *  0           1 * 0 0  0
1 0 1 *  1           1 * 0 1  1
1 1 0 *  1           1 * 1 0  0
1 1 1 *  1           1 * 1 1  1

To query the L2R system with an assertion such as A ∨ ~C = True, examples of this function would be used to test the system to see whether it agrees.
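A toy rendering of this query protocol follows, using a nearest-neighbour stand-in for the trained network (an assumption for brevity; the actual approach uses consolidated csMTL/DLA networks).

```python
from itertools import product

# Toy L2R query protocol (illustrative; a 1-nearest-neighbour model
# stands in for the trained and consolidated network).

def examples(f):
    """All (a, b, c, d) -> truth-value examples of a Boolean function."""
    return [(bits, f(*bits)) for bits in product([0, 1], repeat=4)]

# Retained knowledge: examples of the two learned assertions.
known = examples(lambda a, b, c, d: int((a and b) or c)) + \
        examples(lambda a, b, c, d: int((a or c) and d))

def predict(x):
    """Stand-in learner: label of the nearest retained example."""
    return min(known, key=lambda r: sum(u != v for u, v in zip(r[0], x)))[1]

# Query: present examples of the assertion A v ~C = True and measure
# how often the system's predictions agree with their labels.
query = examples(lambda a, b, c, d: int(a or not c))
agreement = sum(predict(x) == t for x, t in query) / len(query)
print(f"agreement with (A v ~C) = True: {agreement:.0%}")
```

If agreement exceeds a chosen threshold, the system is taken to entail the queried assertion in the PAC sense described above.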
Summary
- Propose that the AI community move to systems that are capable of learning, retaining and using knowledge over a lifetime
- Opportunities for advances in AI lie at the intersection of machine learning and knowledge representation
- Consider the acquisition of knowledge in a form that can be used for more general AI, such as Learning to Reason (L2R)
- Methods of knowledge consolidation will provide insights into how best to represent common knowledge, fundamental to intelligent systems
Recommendations
Researchers should:
- Exploit common ground
- Explore differences
- Find low-hanging fruit
- Make new discoveries

Encourage the pursuit of AI systems that are able to learn the knowledge that they use for reasoning.
Thank You!
QUESTIONS?
[email protected] http://tinyurl/dsilver http://ml3.acadiau.ca