Action-Perception-Learning Cycles 2012 Fall Graduate Course
Byoung-Tak Zhang
Department of Computer Science and Engineering & Cognitive Science and Brain Science Programs
Seoul National University
http://bi.snu.ac.kr/
2012 (c) SNU Biointelligence Lab, http://bi.snu.ac.kr/
What is a Learning System?
• Learning is the improvement of performance in some environment through the acquisition of knowledge resulting from experience in that environment.
• That is: the improvement of behavior on some performance task through acquisition of knowledge based on partial task experience.
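Machine Learning: An Example
[Figure: multilayer network with inputs x1, x2, x3 and one output, arranged as input layer, hidden layer, and output layer. Labels: weights, activation function, scaling function, information propagation, output comparison, error backpropagation.]
Network output: $\mathbf{x} \rightarrow o = f(\mathbf{x})$
Error: $E_d(\mathbf{w}) = \frac{1}{2} \sum_{k \in \text{outputs}} (t_k - o_k)^2$
Weight update: $w_i \leftarrow w_i + \Delta w_i, \quad \Delta w_i = -\eta \frac{\partial E}{\partial w_i}$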
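The error-backpropagation idea on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's own code: the function name `train_mlp`, the sigmoid activation, the learning rate `eta`, the layer sizes, and the toy majority-vote target are all assumptions chosen for the example. The error is half the summed squared difference between targets and outputs, accumulated over the whole training set, and each weight moves against its error gradient.

```python
import numpy as np

def sigmoid(a):
    """Logistic activation function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + np.exp(-a))

def train_mlp(X, T, n_hidden=4, eta=0.2, epochs=2000, seed=0):
    """Batch gradient descent on E = 1/2 * sum (t - o)^2 for a one-hidden-layer net."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=(n_hidden, T.shape[1]))
    b2 = np.zeros(T.shape[1])
    errors = []
    for _ in range(epochs):
        # Information propagation (forward pass).
        H = sigmoid(X @ W1 + b1)
        O = sigmoid(H @ W2 + b2)
        # Output comparison: squared error summed over examples and outputs.
        errors.append(0.5 * np.sum((T - O) ** 2))
        # Error backpropagation: gradients w.r.t. the pre-activations.
        delta_o = (O - T) * O * (1.0 - O)
        delta_h = (delta_o @ W2.T) * H * (1.0 - H)
        # Weight update: w <- w - eta * dE/dw.
        W2 -= eta * (H.T @ delta_o)
        b2 -= eta * delta_o.sum(axis=0)
        W1 -= eta * (X.T @ delta_h)
        b1 -= eta * delta_h.sum(axis=0)
    return (W1, b1, W2, b2), errors

# Toy data (hypothetical): all 3-bit inputs, target = majority vote of the bits.
X = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)], dtype=float)
T = (X.sum(axis=1) >= 2).astype(float).reshape(-1, 1)
params, errors = train_mlp(X, T)
print(f"error before: {errors[0]:.3f}, after: {errors[-1]:.3f}")
```

The three inputs mirror the x1, x2, x3 of the slide; a network of ALVINN's size would follow the same update rule with larger weight matrices.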
Application Example: Autonomous Land Vehicle (ALV)
• NN learns to steer an autonomous vehicle.
• 960 input units, 4 hidden units, 30 output units
• Driving at speeds up to 70 miles per hour
ALVINN System
Google “Self-Driving Car”
• DARPA Grand Challenge (2005)
• DARPA Urban Challenge (2007)
• Google Self-Driving Car (2009)
Machine Learning (ML): Three Tasks
• Supervised Learning
– Estimate an unknown mapping from known input and target output pairs
– Learn $f_{\mathbf{w}}$ from training set D = {(x, y)} s.t. $f_{\mathbf{w}}(\mathbf{x}) = y = f(\mathbf{x})$
– Classification: y is discrete
– Regression: y is continuous
• Unsupervised Learning
– Only input values are provided
– Learn $f_{\mathbf{w}}$ from D = {(x)} s.t. $f_{\mathbf{w}}(\mathbf{x}) = \mathbf{x}$
– Compression
– Clustering
• Reinforcement Learning
– No targets, but rewards (critiques) are provided “sequentially”
– Learn a heuristic function $f_{\mathbf{w}}$ from $D_t = \{(s_t, a_t, r_t) \mid t = 1, 2, \ldots\}$ s.t. $f_{\mathbf{w}}(s_t, a_t, r_t)$
– Action selection
– Policy learning
Zhang, B.-T., Next-Generation Machine Learning Technologies, Communications of KIISE, 25(3), 2007
Machine Learning Models
• Symbolic Learning
– Version Space Learning
– Case-Based Learning
• Neural Learning
– Multilayer Perceptrons
– Self-Organizing Maps
– Support Vector Machines
– Kernel Machines
• Evolutionary Learning
– Evolution Strategies
– Evolutionary Programming
– Genetic Algorithms
– Genetic Programming
– Molecular Programming
• Probabilistic Learning
– Bayesian Networks
– Helmholtz Machines
– Markov Random Fields
– Hypernetworks
– Latent Variable Models
– Generative Topographic Mapping
• Other Methods
– Decision Trees
– Reinforcement Learning
– Boosting Algorithms
– Mixture of Experts
– Independent Component Analysis
Zhang, B.-T., Next-Generation Machine Learning Technologies, Communications of KIISE, 25(3), 2007
From Machine Learning to Brain-Like Cognitive Learning
Machine Learning vs. Human Learning
Machine Learning
• Clear separation of learning and inference
• Examples are assumed to be statistically independent
• Mainly numerical, quantitative change
• One-shot learning is difficult
• Requires uniquely labeled examples (supervised classification)
• Good at discrimination and classification (discriminative)
Human Learning
• Learning and inference interleaved
• Previous learning affects the next learning (dynamic)
• Relational, qualitative change possible
• One-shot learning is frequent
• Learns from unlabeled or self-labeled examples (self-supervised)
• Can generate prototypes and instances (generative)
Human Learning: Properties
• Sensorimotor
• Real-time
• Predictive
• Incremental
• Dynamic
• Structural
• One-shot
• Self-supervised
• Prototypical
• Generative
• Recall
Humans and Computers
Current Computers
What Kind of Computers?
Human Computers
The Entire Problem Space
Cognitive Systems
[Figure labels: Openness; Perception; Action; Cognitive System; Cognitive Computing; Real-Time Dynamics; Multisensory Integration; Sequential Generation]
Cognitive Systems Require Cognitive Computing or Cognitive Information Processing
Zhang, B.-T., Communications of KIISE, 30(1):75-111, 2012
TU Munich “Rosie” the Cognitive Robot
Apple “Siri” Personal Assistant
Toward Human-Level Computational Intelligence: A Perspective of the SNU Biointelligence Lab
• Q1: What capability is fundamentally missing for achieving human-level computational intelligence?
– A1: Human-level machine learning that enables rapid, flexible, and robust decisions and actions in dynamic and uncertain environments.
• Q2: What aspect is the most essential to study human-level machine learning?
– A2: Lifelong learning with perception-action cycles, i.e. the circular flow of information that takes place between the organism and its environment in the course of a sensory-guided sequence of behavior towards a goal (Fuster, 2004).
• Q3: What capabilities are required for lifelong learning in perception-action cycle systems?
– A3: Dynamic, incremental, online, and predictive learning. Flexible representation and fast reorganization. Multisensory integration, sensorimotor imagery, and sequential decision making. Active, selective attention. Balancing exploration and exploitation. Self-awareness, motivation, self-sustainability….
Course Introduction
• From machine learning to brain-like cognitive learning
• Brain as a physical, thermodynamic computer
• Perception-action cycles and Carnot cycles
• Models of action-perception-learning cycles
Brain as a Physical, Thermodynamic Computer
• The brain is an open, dissipative system, operating far from thermodynamic equilibrium.
• The brain must exchange energy and matter with its environment to maintain stability.
• The brain can be excited internally by chemical means (enzymes) and electrical means (action potentials), as well as externally.
• Continuous sensing of the external world and the internal world.
• Continuous action on the external world and the internal world.
Mapping the World
Carnot Cycle for a Pyramidal Neuron
[Fry, 2005; Fry, 2008]
Carnot Cycle for the Brain
[Freeman et al., 2012]
Information Physics of Biological Systems
[Bialek et al., 2007]
[Two slides by Robert Fry]
Perception-Action Cycles
Perception-Action Cycle in Autonomous Helicopter Control
(Reference: Andrew Ng, Stanford Univ.)
Stanford Autonomous Helicopter - Airshow #2: http://www.youtube.com/watch?v=VCdxqn0fcnE
Perception-Action Cycle in Humans
[Trommershaeuser et al., Sensory Cue Integration, 2011]
Perception-Action Cycle in Communication between A and B
Perception-Action Cycle in Language Comprehension
Perception-Action Cycle in Robots
[Zahedi et al., Adaptive Behavior, 2009]
Perception-Action Cycle
[Zahedi et al., Adaptive Behavior, 2009]
Predictive Information
[Zahedi et al., Adaptive Behavior, 2009]
Sensory Prediction
[Zahedi et al., Adaptive Behavior, 2009]
Free Energy and the Perception-Action Cycle
[Friston, Trends in Cognitive Sciences, 2009]
Reinforcement Learning and the Perception-Action Cycles
[Tishby & Polani, 2010]
= (information-to-go) – (value-to-go)
Brain Mechanisms for the Perception-Action-Learning Cycle
Brain Computation: Speed, Flexibility, Robustness
How can brain computation be so fast, flexible, and robust in a changing environment?
– Fast
• Object recognition: within 100 ms
• Anomaly detection: N400, P600
• Instant decision-making
– Flexible
• Invariant to shift, scale, and rotation
• Various utterances for the same meaning
• Art, music, literature, and dancing
– Robust
• Cluttered image
• Noisy speech
• Intention reading under complex situations
What brain mechanisms for information processing and organization allow this?
Language Processing in the Brain
• N400: a brain wave related to linguistic processes.
• Increased when semantically mismatched
Fig. 9.30: ERP waveforms differentiate between congruent words at the end of sentences (work) and anomalous last words that do not fit the semantic specifications of the preceding context (socks).
Syntactic Processing in the Brain
• LAN (left anterior negativity): negative wave over the left frontal areas when words violate the required word category in a sentence (syntactic violation)
• e.g. “the red eats”, “he mow”
[Figure: ERPs related to semantic and syntactic processing. Panels: Semantic; Syntactic.]
Brain as a Widely Distributed, Parallel, Interactive, Overlapping, Dynamic Relational Memory Network
[Fuster, 2004]
Neural Representations and Processing
• “Chemical” and “molecular” basis of synapses
• Distributed representation
• Multiple overlapping representations
• Hierarchical representation
• Associative recall
• Population coding
• Assembly coding
• Sparse coding
• Temporal coding
• Synfire chain
• Dynamic coordination
• Correlation coding
Bayesian Brain: Multisensory Integration
[Knill & Pouget, 2004]
Population Coding (Representation)
Population activity:
$A(t) = \lim_{T \to 0} \frac{1}{T} \, \frac{\text{number of spikes in population of size } N}{N} = \lim_{T \to 0} \frac{1}{T} \, \frac{1}{N} \int_{t - T/2}^{t + T/2} \sum_{i=1}^{N} \sum_{f} \delta(t' - t_i^{f}) \, dt'$
Rate Coding
Gain Coding
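As a rough illustration of population rate coding, the population activity A(t) can be approximated with a finite window T: count all spikes the population emits within T around t, then divide by T and by the population size N. The sketch below assumes spike times are given in seconds as one list per neuron; the function name `population_activity` and the example spike trains are hypothetical.

```python
import numpy as np

def population_activity(spike_times, t, window):
    """Finite-window estimate of A(t) in spikes per second per neuron.

    spike_times: list with one sequence of spike times (seconds) per neuron.
    t: centre of the counting window; window: window length T (seconds).
    """
    n_neurons = len(spike_times)
    all_spikes = np.concatenate([np.asarray(s, dtype=float) for s in spike_times])
    in_window = (all_spikes >= t - window / 2) & (all_spikes < t + window / 2)
    return in_window.sum() / (window * n_neurons)

# Hypothetical population of three neurons.
spikes = [[0.010, 0.052], [0.049], [0.051, 0.200]]
rate = population_activity(spikes, t=0.05, window=0.01)
print(rate)  # three spikes fall in the 10 ms window
```

Shrinking the window approaches the limit in the definition, at the cost of a noisier estimate when the population is small.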
Probabilistic Inference with Population Codes
[Knill and Pouget, Trends in Neurosciences, 2004]
Dynamics in Sensory Cue Integration
[Deneve et al., Nature Neuroscience, 2001, from Knill and Pouget, Trends in Neurosciences, 2004]
Models of Perception-Action-Learning Cycles
Markov Models (Markov Chains)
First-order Markov Model (Markov Chain)
Second-order Markov Model
Latent Markov Models (Hidden Markov Models)
Filtering / Tracking
• We want to track the unknown state x of a system as it evolves over time, based on the (noisy) observations y that arrive sequentially.
[Figure: state-space model with states $x_{t-1}, x_t, x_{t+1}$ and observations $y_{t-1}, y_t, y_{t+1}$. Transition: $p(x_t \mid x_{t-1})$. Observation: $p(y_t \mid x_t)$.]
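For a discrete state, tracking reduces to a two-step recursion: predict the state distribution with the transition model p(x_t | x_{t-1}), then correct it with the observation model p(y_t | x_t) and renormalize. The following is a minimal sketch under assumed names (`forward_filter`, `transition`, `emission`) and a made-up two-state model; it is not taken from the slides.

```python
import numpy as np

def forward_filter(prior, transition, emission, observations):
    """Return the filtered beliefs p(x_t | y_1..y_t), one row per time step.

    prior[j]         = p(x_1 = j)
    transition[i, j] = p(x_t = j | x_{t-1} = i)
    emission[j, k]   = p(y_t = k | x_t = j)
    """
    belief = np.asarray(prior, dtype=float)
    filtered = []
    for t, y in enumerate(observations):
        if t > 0:
            belief = belief @ transition      # predict with the transition model
        belief = belief * emission[:, y]      # correct with the observation model
        belief = belief / belief.sum()        # renormalize to a distribution
        filtered.append(belief)
    return np.array(filtered)

# Hypothetical two-state system with sticky transitions and noisy observations.
transition = np.array([[0.9, 0.1],
                       [0.1, 0.9]])
emission = np.array([[0.8, 0.2],
                     [0.2, 0.8]])
beliefs = forward_filter([0.5, 0.5], transition, emission, observations=[0, 0, 1])
print(beliefs)
```

Each row of `beliefs` is the belief about the hidden state after the corresponding observation has arrived, which is what filtering means in this setting.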
Linear Dynamical Systems (Kalman Filters)
Kalman Filter
• Process to be estimated:
$y_k = A y_{k-1} + B u_k + w_{k-1}$
$z_k = H y_k + v_k$
Process noise (w) with covariance Q; measurement noise (v) with covariance R.
• Kalman Filter
Predicted: $\hat{y}^-_k$ is the estimate based on measurements at previous time steps.
$\hat{y}^-_k = A \hat{y}_{k-1} + B u_k$
$P^-_k = A P_{k-1} A^T + Q$
Corrected: $\hat{y}_k$ has additional information, namely the measurement at time k.
$K = P^-_k H^T (H P^-_k H^T + R)^{-1}$
$\hat{y}_k = \hat{y}^-_k + K (z_k - H \hat{y}^-_k)$
$P_k = (I - K H) P^-_k$
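One predict-and-correct cycle of the filter can be written directly from these equations. The sketch below is illustrative only: the function name `kalman_step` and the scalar example values are assumptions, and the previous corrected estimate is used in the prediction step, as in the standard formulation.

```python
import numpy as np

def kalman_step(y_prev, P_prev, z, A, B, u, H, Q, R):
    """One Kalman filter cycle: predict from the model, then correct with z."""
    # Predict: propagate the previous estimate and its covariance.
    y_pred = A @ y_prev + B @ u
    P_pred = A @ P_prev @ A.T + Q
    # Correct: blend the prediction with the measurement via the gain K.
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)
    y_new = y_pred + K @ (z - H @ y_pred)
    P_new = (np.eye(len(y_prev)) - K @ H) @ P_pred
    return y_new, P_new

# Hypothetical scalar example: static state, unit prior and measurement variance.
A = np.array([[1.0]]); B = np.array([[0.0]]); H = np.array([[1.0]])
Q = np.array([[0.0]]); R = np.array([[1.0]])
y_hat, P = kalman_step(np.array([0.0]), np.array([[1.0]]),
                       z=np.array([2.0]), A=A, B=B, u=np.array([0.0]),
                       H=H, Q=Q, R=R)
print(y_hat, P)
```

With equal prior and measurement variance the gain is 0.5, so the corrected estimate lands halfway between the prediction and the measurement and the variance is halved.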
Filtering
Discrete x
Continuous x
[Barber et al., 2011]
Smoothing
Parallel Smoothing
Sequential Smoothing
Discrete x
Continuous x
[Barber et al., 2011]
Prediction
Interpolation
Most-likely latent trajectory
[Barber et al., 2011]
Sequential Importance Sampling
Choosing the proposal distribution:
• Bootstrap filter
• Optimal choice (minimum variance)
[Barber et al., 2011]
Sequential Importance Resampling or Particle Filter
[Barber et al., 2011]
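A bootstrap particle filter is one simple instance of sequential importance resampling: particles are proposed from the transition model, weighted by the observation likelihood, and then resampled in proportion to their weights. The sketch below assumes a one-dimensional random-walk state with Gaussian observation noise; the model, the function name `bootstrap_particle_filter`, and the parameters `q` and `r` are illustrative assumptions, not the example from Barber et al.

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles=500, q=0.5, r=1.0, seed=0):
    """Track x_t = x_{t-1} + N(0, q^2) from y_t = x_t + N(0, r^2).

    Returns the weighted posterior mean estimate of x_t at each time step.
    """
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, size=n_particles)  # assumed prior p(x_0)
    means = []
    for y in observations:
        # Propose: sample each particle from the transition model.
        particles = particles + rng.normal(0.0, q, size=n_particles)
        # Weight: Gaussian observation likelihood, computed in log space.
        log_w = -0.5 * ((y - particles) / r) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means.append(np.sum(w * particles))
        # Resample: draw particles in proportion to their weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(means)

# Hypothetical data: the sensor keeps reporting a value of 3.0.
means = bootstrap_particle_filter(np.full(30, 3.0))
print(means[-1])
```

Resampling at every step keeps the particle set concentrated where the posterior has mass; with N = 4 particles, as in the slide's example, the same loop applies but the estimate is much noisier.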
Example: PF with N=4
[Barber et al., 2011]
Course Overview
Action-Perception-Learning Cycles
Course Description
How can the brain learn so fast, flexibly, and robustly? What representational mechanisms and organizational principles does the brain use? How can we apply these principles to constructing intelligent cognitive machines that learn like humans? To address these questions, it is important to observe that the brain is embodied with sensors and actuators, and interacts with its environment in a continuous perception-action cycle. Living in a dynamic environment under uncertainty requires the brain to learn moment by moment, in real time and incrementally, within this continuous, rapid perception-action cycle. In this course we review recent experimental and theoretical work on perception-action cycles and neural coding principles in the brain. We also study mathematical tools developed in information theory, control theory, and Bayesian statistics that may be useful for modeling biological information processing in the brain. The goal is to develop computational models of sequential learning processes, i.e. action-perception-learning cycle machines, that enable rapid, continuous, and reliable action and decision-making in a changing environment over an extended period of time, or over a lifetime.
Plan
• Part I: Neurocognitive Models
– Cortical Models
– Language Models
– Thermodynamic Models
– Free Energy Models
– Decision-Theoretic Models
– Information-Theoretic Models
– Exam 1: Thursday, Oct. 18, 2012
• Part II: Computational Models
– Markov Models
– Dynamical Systems
– Kalman Filters
– Probabilistic Population Codes
– Particle Filters
– Exam 2: Thursday, Nov. 29, 2012