Action-Perception-Learning Cycles 2012 Fall Graduate Course

Action-Perception-Learning Cycles 2012 Fall Graduate Course Byoung-Tak Zhang Department of Computer Science and Engineering & Cognitive Science and Brain Science Programs Seoul National University http://bi.snu.ac.kr/


Page 1: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Action-Perception-Learning Cycles 2012 Fall Graduate Course

Byoung-Tak Zhang

Department of Computer Science and Engineering & Cognitive Science and Brain Science Programs

Seoul National University

http://bi.snu.ac.kr/

Page 2: Action-Perception-Learning Cycles 2012 Fall Graduate Course

2012 (c) SNU Biointelligence Lab, http://bi.snu.ac.kr/


What is a Learning System?
• Learning is the improvement of performance in some environment through the acquisition of knowledge resulting from experience in that environment.
  – the improvement of behavior
  – on some performance task
  – through acquisition of knowledge
  – based on partial task experience

Page 3: Action-Perception-Learning Cycles 2012 Fall Graduate Course

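Machine Learning: An Example

[Figure: three-layer feedforward network (input layer with inputs x1, x2, x3; hidden layer; output layer) with weights, activation function, and scaling function. Information propagates forward, the output is compared with the target, and the error is backpropagated.]

$$\mathbf{x} \rightarrow o = f(\mathbf{x})$$

$$E_d(\mathbf{w}) = \frac{1}{2} \sum_{k \in \text{outputs}} (t_k - o_k)^2$$

$$w_i \leftarrow w_i + \Delta w_i, \qquad \Delta w_i = -\eta \frac{\partial E}{\partial w_i}$$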
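The squared-error objective and weight update on this slide can be sketched for a single sigmoid output unit. This is a minimal illustration, not the network from the figure: the input vector, the target, the learning rate `eta`, and the helper names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def squared_error(w, x, t):
    """E = 0.5 * (t - o)^2 for one sigmoid unit with output o."""
    o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return 0.5 * (t - o) ** 2

def train_step(w, x, t, eta=0.5):
    """One gradient-descent step: w_i <- w_i - eta * dE/dw_i."""
    o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    # Chain rule: dE/dw_i = -(t - o) * o * (1 - o) * x_i
    grad = [-(t - o) * o * (1.0 - o) * xi for xi in x]
    return [wi - eta * gi for wi, gi in zip(w, grad)]

w = [0.0, 0.0, 0.0]
x = [1.0, 0.5, -0.5]   # hypothetical inputs x1, x2, x3
t = 1.0                # hypothetical target output
error_before = squared_error(w, x, t)
for _ in range(100):
    w = train_step(w, x, t)
error_after = squared_error(w, x, t)
```

Repeating the step moves the output toward the target, so `error_after` ends up below `error_before`. A multilayer network applies the same rule to every weight, with the error term propagated backwards through the hidden layer.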

Page 4: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Application Example: Autonomous Land Vehicle (ALV)

• NN learns to steer an autonomous vehicle.
• 960 input units, 4 hidden units, 30 output units
• Driving at speeds up to 70 miles per hour

ALVINN System

Page 5: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Google “Self-Driving Car”
• DARPA Grand Challenge (2005)
• DARPA Urban Challenge (2007)
• Google Self-Driving Car (2009)

Page 6: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Machine Learning (ML): Three Tasks

• Supervised Learning
  – Estimate an unknown mapping from known input and target output pairs
  – Learn f_w from training set D = {(x, y)} s.t. $f_{\mathbf{w}}(\mathbf{x}) \approx y = f(\mathbf{x})$
  – Classification: y is discrete
  – Regression: y is continuous
• Unsupervised Learning
  – Only input values are provided
  – Learn f_w from D = {(x)} s.t. $f_{\mathbf{w}}(\mathbf{x}) \approx \mathbf{x}$
  – Compression
  – Clustering
• Reinforcement Learning
  – Not targets, but rewards (critiques) are provided “sequentially”
  – Learn a heuristic function f_w from D_t = {(s_t, a_t, r_t) | t = 1, 2, …} s.t. $f_{\mathbf{w}}(\mathbf{s}_t, a_t, r_t)$
  – Action selection
  – Policy learning

Zhang, B.-T., Next-Generation Machine Learning Technologies, Communications of KIISE, 25(3), 2007

Page 7: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Machine Learning Models
• Symbolic Learning
  – Version Space Learning
  – Case-Based Learning
• Neural Learning
  – Multilayer Perceptrons
  – Self-Organizing Maps
  – Support Vector Machines
  – Kernel Machines
• Evolutionary Learning
  – Evolution Strategies
  – Evolutionary Programming
  – Genetic Algorithms
  – Genetic Programming
  – Molecular Programming
• Probabilistic Learning
  – Bayesian Networks
  – Helmholtz Machines
  – Markov Random Fields
  – Hypernetworks
  – Latent Variable Models
  – Generative Topographic Mapping
• Other Methods
  – Decision Trees
  – Reinforcement Learning
  – Boosting Algorithms
  – Mixture of Experts
  – Independent Component Analysis

Zhang, B.-T., Next-Generation Machine Learning Technologies, Communications of KIISE, 25(3), 2007

Page 8: Action-Perception-Learning Cycles 2012 Fall Graduate Course

From Machine Learning to Brain-Like Cognitive Learning

Page 9: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Machine Learning vs. Human Learning

Machine Learning
• Clear separation of learning and inference
• Examples are assumed to be statistically independent
• Mainly numerical, quantitative change
• One-shot learning is difficult
• Requires uniquely labeled examples (supervised classification)
• Good at discrimination and classification (discriminative)

Human Learning
• Learning and inference interleaved
• Previous learning affects the next learning (dynamic)
• Relational, qualitative change possible
• One-shot learning is frequent
• Learns from unlabeled or self-labeled examples (self-supervised)
• Can generate prototypes and instances (generative)

Page 10: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Human Learning: Properties

• Sensorimotor
• Real-time
• Predictive
• Incremental
• Dynamic
• Structural
• One-shot
• Self-supervised
• Prototypical
• Generative
• Recall

Page 11: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Humans and Computers

Current Computers

What Kind of Computers?

Human Computers

The Entire Problem Space

Page 12: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Cognitive Systems

Openness

Perception

Action

Cognitive System

Cognitive Computing

Real-Time Dynamics

Multisensory Integration

Sequential Generation

Cognitive Systems Require Cognitive Computing or Cognitive Information Processing

Zhang, B.-T., Communications of KIISE, 30(1):75-111, 2012

Page 13: Action-Perception-Learning Cycles 2012 Fall Graduate Course

TU Munich “Rosie” the Cognitive Robot


Page 14: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Apple “Siri” Personal Assistant


Page 15: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Toward Human-Level Computational Intelligence: A Perspective of the SNU Biointelligence Lab

• Q1: What capability is fundamentally missing for achieving human-level computational intelligence?
  – A1: Human-level machine learning that enables rapid, flexible, and robust decisions and actions in dynamic and uncertain environments.
• Q2: What aspect is the most essential to study human-level machine learning?
  – A2: Lifelong learning with perception-action cycles, i.e. the circular flow of information that takes place between the organism and its environment in the course of a sensory-guided sequence of behavior towards a goal (Fuster, 2004).
• Q3: What capabilities are required for lifelong learning in perception-action cycle systems?
  – A3: Dynamic, incremental, online, and predictive learning. Flexible representation and fast reorganization. Multisensory integration, sensorimotor imagery, and sequential decision making. Active, selective attention. Balancing exploration and exploitation. Self-awareness, motivation, self-sustainability….

Page 16: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Course Introduction
• From machine learning to brain-like cognitive learning
• Brain as a physical, thermodynamic computer
• Perception-action cycles and Carnot cycles
• Models of action-perception-learning cycles

Page 17: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Brain as a Physical, Thermodynamic Computer

Page 18: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Brain as a Physical, Thermodynamic Computer

• Brain is an open, dissipative system, operating far from thermodynamic equilibrium.
• Brain requires energy and matter to exchange with its environment to maintain stability.
• Brain can be excited internally by chemical (enzymes) and electrical means (action potentials) as well as externally.
• Continuous sensing of external world and internal world.
• Continuous action on external world and internal world.

Page 19: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Mapping the World

Page 20: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Page 21: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Page 22: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Page 23: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Page 24: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Carnot Cycle for a Pyramidal Neuron

[Fry, 2005; Fry, 2008]

Page 25: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Carnot Cycle for the Brain

[Freeman et al., 2012]

Page 26: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Information Physics of Biological Systems

[Bialek et al., 2007]

Page 27: Action-Perception-Learning Cycles 2012 Fall Graduate Course

[Slide by Robert Fry]

Page 28: Action-Perception-Learning Cycles 2012 Fall Graduate Course


[Slide by Robert Fry]

Page 29: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Perception-Action Cycles

Page 30: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Perception-Action Cycle in Autonomous Helicopter Control

(Reference: Andrew Ng, Stanford Univ.)

Stanford Autonomous Helicopter - Airshow #2: http://www.youtube.com/watch?v=VCdxqn0fcnE

Page 31: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Perception-Action Cycle in Humans

[Trommershaeuser et al., Sensory Cue Integration, 2011]

Page 32: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Perception-Action Cycle in Communication between A and B

Page 33: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Perception-Action Cycle in Language Comprehension

Page 34: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Perception-Action Cycle in Robots

[Zahedi et al., Adaptive Behavior, 2009]

Page 35: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Perception-Action Cycle

[Zahedi et al., Adaptive Behavior, 2009]

Page 36: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Predictive Information

[Zahedi et al., Adaptive Behavior, 2009]

Page 37: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Sensory Prediction

[Zahedi et al., Adaptive Behavior, 2009]

Page 38: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Free Energy and the Perception-Action Cycle

[Friston, Trends in Cognitive Sciences, 2009]

Page 39: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Reinforcement Learning and the Perception-Action Cycles

[Tishby & Polani, 2010]

= (information-to-go) – (value-to-go)

Page 40: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Brain Mechanisms for the Perception-Action-Learning Cycle

Page 41: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Brain Computation: Speed, Flexibility, Robustness

How can brain computation be so fast, flexible, and robust in a changing environment?
– Fast
  • Object recognition: within 100 ms
  • Anomaly detection: N400, P600
  • Instant decision-making
– Flexible
  • Invariant to shift, scale, and rotation
  • Various utterances for the same meaning
  • Art, music, literature, and dancing
– Robust
  • Cluttered image
  • Noisy speech
  • Intention reading under complex situations

What brain mechanisms for information processing and organization allow this?

Page 42: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Language Processing in the Brain
• N400: a brain wave related to linguistic processes.
• Increased when semantically mismatched

Fig. 9.30: ERP waveforms differentiate between congruent words at the end of sentences (work) and anomalous last words that do not fit the semantic specifications of the preceding context (socks).

Page 43: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Syntactic Processing in the Brain
• LAN (left anterior negativity): negative wave over the left frontal areas when words violate the required word category in a sentence (syntactic violation)
• e.g. “the red eats”, “he mow”

ERPs related to semantic and syntactic processing.

Semantic

Syntactic

Page 44: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Brain as a Widely Distributed, Parallel, Interactive, Overlapping, Dynamic Relational Memory Network

[Fuster, 2004]

Page 45: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Neural Representations and Processing

• “Chemical” and “molecular” basis of synapses
• Distributed representation
• Multiple overlapping representations
• Hierarchical representation
• Associative recall
• Population coding
• Assembly coding
• Sparse coding
• Temporal coding
• Synfire chain
• Dynamic coordination
• Correlation coding

Page 46: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Bayesian Brain: Multisensory Integration

[Knill & Pouget, 2004]
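One standard way to make Bayesian multisensory integration concrete is reliability-weighted cue combination: two independent Gaussian cues about the same quantity are fused by weighting each with its inverse variance. The sketch below assumes Gaussian noise and a flat prior; the function name and the numbers are hypothetical, not taken from the slide.

```python
def combine_cues(mu1, var1, mu2, var2):
    """Fuse two independent Gaussian cues about the same quantity.

    Each cue is weighted by its reliability (inverse variance).
    Returns the posterior mean and variance under a flat prior.
    """
    r1, r2 = 1.0 / var1, 1.0 / var2
    w1 = r1 / (r1 + r2)
    mu = w1 * mu1 + (1.0 - w1) * mu2
    var = 1.0 / (r1 + r2)
    return mu, var

# Hypothetical example: a noisy visual cue and a sharper haptic cue
mu, var = combine_cues(mu1=10.0, var1=4.0, mu2=14.0, var2=1.0)
```

The fused estimate lies closer to the more reliable cue, and its variance is smaller than that of either cue alone.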

Page 47: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Population Coding (Representation)

$$A(t) = \lim_{T \to 0} \frac{1}{T}\,\frac{\text{number of spikes in population of size } N}{N} = \lim_{T \to 0} \frac{1}{T}\,\frac{1}{N} \int_{t - T/2}^{t + T/2} \sum_{i=1}^{N} \sum_{f} \delta(t' - t_i^f)\, dt'$$

Rate Coding

Gain Coding
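The population activity A(t) on this slide can be approximated from recorded spike times by counting spikes in a short window around t and dividing by the population size and the window length. This is a minimal sketch with a finite window instead of the limit; the spike times and the function name are hypothetical.

```python
def population_activity(spike_trains, t, window):
    """Estimate A(t) in spikes per neuron per second.

    spike_trains: one list of spike times (seconds) per neuron.
    Counts spikes in [t - window/2, t + window/2) across the population.
    """
    n_neurons = len(spike_trains)
    lo, hi = t - window / 2.0, t + window / 2.0
    n_spikes = sum(1 for train in spike_trains for s in train if lo <= s < hi)
    return n_spikes / (n_neurons * window)

# Hypothetical spike times for a population of N = 4 neurons
spike_trains = [
    [0.010, 0.052, 0.097],
    [0.031, 0.049],
    [0.048, 0.055, 0.120],
    [],
]
rate = population_activity(spike_trains, t=0.05, window=0.02)
```

Here four spikes fall inside the 20 ms window, so the estimate is about 50 spikes per neuron per second.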

Page 48: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Probabilistic Inference with Population Codes

[Knill and Pouget, Trends in Neurosciences, 2004]

Page 49: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Dynamics in Sensory Cue Integration

[Deneve et al., Nature Neuroscience, 2001, from Knill and Pouget, Trends in Neurosciences, 2004]

Page 50: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Models of Perception-Action-Learning Cycles

Page 51: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Markov Models (Markov Chains)

First-order Markov Model (Markov Chain)

Second-order Markov Model
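In a first-order model the next symbol depends only on the previous one, p(x_t | x_{t-1}); in a second-order model it depends on the previous two, p(x_t | x_{t-1}, x_{t-2}). The sketch below estimates both kinds of transition table from one sequence by counting. The sequence and the function name `fit_markov` are hypothetical.

```python
from collections import Counter, defaultdict

def fit_markov(sequence, order=1):
    """Maximum-likelihood estimate of p(x_t | previous `order` symbols)."""
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        counts[context][sequence[i]] += 1
    # Normalise the counts for each context into probabilities
    return {
        ctx: {sym: n / sum(c.values()) for sym, n in c.items()}
        for ctx, c in counts.items()
    }

seq = list("ABABBABAAB")              # hypothetical observation sequence
first = fit_markov(seq, order=1)      # keys are 1-tuples, e.g. ("A",)
second = fit_markov(seq, order=2)     # keys are 2-tuples, e.g. ("A", "B")
```

For this sequence, `first[("A",)]` gives B a probability of about 0.8 and A about 0.2. The second-order table conditions on pairs, so it has more contexts and fewer counts per context.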

Page 52: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Latent Markov Models (Hidden Markov Models)

Page 53: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Filtering / Tracking
• We want to track the unknown state x of a system as it evolves over time based on the (noisy) observations y that arrive sequentially.

[Figure: state-space model. The hidden states x_{t-1}, x_t, x_{t+1} evolve through the transition p(x_t | x_{t-1}); each state x_t emits an observation y_t through p(y_t | x_t).]
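For a discrete state, filtering with the transition p(x_t | x_{t-1}) and observation model p(y_t | x_t) reduces to a predict step followed by an update step. The sketch below is the standard forward recursion for a hidden Markov model; the two-state example and all of its probabilities are hypothetical.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered posteriors p(x_t | y_1..y_t) for a discrete HMM.

    transition[i][j] = p(x_t = j | x_{t-1} = i)
    emission[j][y]   = p(y_t = y | x_t = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for step, y in enumerate(observations):
        if step > 0:
            # Predict: propagate the belief through the transition model
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight by the observation likelihood, then normalise
        belief = [belief[j] * emission[j][y] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history

# Hypothetical two-state system with sticky transitions and noisy observations
prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.1, 0.9]]
emission = [[0.8, 0.2], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, observations=[0, 0, 1])
```

After the first observation the belief in state 0 is 0.8; a second matching observation raises it further, and the conflicting third observation pulls it back down.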

Page 54: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Linear Dynamical Systems (Kalman Filters)

Page 55: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Kalman Filter
• Process to be estimated:

$$y_k = A y_{k-1} + B u_k + w_{k-1}$$

$$z_k = H y_k + v_k$$

  Process noise (w) with covariance Q; measurement noise (v) with covariance R

• Kalman Filter
  Predicted: $\hat{y}^-_k$ is the estimate based on measurements at previous time steps

$$\hat{y}^-_k = A \hat{y}_{k-1} + B u_k$$

$$P^-_k = A P_{k-1} A^T + Q$$

  Corrected: $\hat{y}_k$ has additional information, namely the measurement at time k

$$K = P^-_k H^T (H P^-_k H^T + R)^{-1}$$

$$\hat{y}_k = \hat{y}^-_k + K (z_k - H \hat{y}^-_k)$$

$$P_k = (I - K H) P^-_k$$
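The predict and correct steps of the Kalman filter can be sketched for a scalar state, where every matrix becomes a number. The symbols follow the slide (state estimate y, measurement z, gain K, covariances P, Q, R); the default parameter values and the measurement sequence are hypothetical.

```python
def kalman_step(y_prev, P_prev, z, u=0.0, A=1.0, B=0.0, H=1.0, Q=1e-4, R=0.01):
    """One predict/correct cycle of a scalar Kalman filter."""
    # Predict: project the previous estimate and its uncertainty forward
    y_pred = A * y_prev + B * u
    P_pred = A * P_prev * A + Q
    # Correct: blend the prediction with the new measurement z
    K = P_pred * H / (H * P_pred * H + R)
    y_new = y_pred + K * (z - H * y_pred)
    P_new = (1.0 - K * H) * P_pred
    return y_new, P_new

# Hypothetical noisy measurements of a constant value near 1.0
y_hat, P = 0.0, 1.0   # vague initial estimate
for z in [1.1, 0.9, 1.05, 0.95, 1.0]:
    y_hat, P = kalman_step(y_hat, P, z)
```

With a large initial P the first gain is close to 1, so the filter trusts the first measurement almost completely. As P shrinks, later measurements move the estimate less.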

Page 56: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Filtering

Discrete x

Continuous x

[Barber et al., 2011]

Page 57: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Smoothing

Parallel Smoothing

Sequential Smoothing

Discrete x

Continuous x

[Barber et al., 2011]

Page 58: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Prediction

Interpolation

Most-likely latent trajectory

[Barber et al., 2011]

Page 59: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Sequential Importance Sampling

Choosing the proposal distribution:
• Bootstrap filter
• Optimal choice (minimum variance)

[Barber et al., 2011]

Page 60: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Sequential Importance Resampling or Particle Filter

[Barber et al., 2011]
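A bootstrap particle filter is the simplest instance of sequential importance resampling: particles are proposed from the transition model, weighted by the observation likelihood, and then resampled. The sketch below assumes a one-dimensional random-walk state with Gaussian observation noise; the model, its parameters, and the function name are hypothetical and not taken from the cited work.

```python
import math
import random

def bootstrap_particle_filter(observations, n_particles=200,
                              q_std=0.1, r_std=0.5, seed=0):
    """Return the posterior-mean estimate of the state after each observation."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propose: sample from the transition prior (random walk, assumed)
        particles = [x + rng.gauss(0.0, q_std) for x in particles]
        # Weight: Gaussian observation likelihood p(y_t | x_t), assumed
        weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample with replacement in proportion to the weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

# Hypothetical observations scattered around a true state near 1.0
estimates = bootstrap_particle_filter([1.0, 1.2, 0.9, 1.1, 1.0])
```

Resampling concentrates the particles where the weights are large, which limits the weight degeneracy that plain sequential importance sampling suffers from.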

Page 61: Action-Perception-Learning Cycles 2012 Fall Graduate Course


Example: PF with N=4

[Barber et al., 2011]

Page 62: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Course Overview

Action-Perception-Learning Cycles

Page 63: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Course Description

How can the brain learn so fast, flexibly, and robustly? What representational mechanisms and organizational principles does the brain use? How can we apply these principles to constructing intelligent cognitive machines that learn like humans? To address these questions, it is important to observe that the brain is embodied with sensors and actuators, and interacts with its environment in a continuous perception-action cycle. Living in a dynamic environment under uncertainty requires the brain to learn moment by moment in real time and incrementally in this continuous, rapid perception-action cycle. In this course we review recent experimental and theoretical work on perception-action cycles and neural coding principles in the brain. We also study mathematical tools developed in information theory, control theory, and Bayesian statistics that may be useful to model the biological information processing in the brain. The goal is to develop computational models of sequential learning processes, i.e. action-perception-learning cycle machines, that enable rapid, continuous, and reliable action and decision-making in a changing environment over an extended period of time or lifelong.

Page 64: Action-Perception-Learning Cycles 2012 Fall Graduate Course

Plan
• Part I: Neurocognitive Models
  – Cortical Models
  – Language Models
  – Thermodynamic Models
  – Free Energy Models
  – Decision-Theoretic Models
  – Information-Theoretic Models
  – Exam 1: Thursday, Oct. 18, 2012
• Part II: Computational Models
  – Markov Models
  – Dynamical Systems
  – Kalman Filters
  – Probabilistic Population Codes
  – Particle Filters
  – Exam 2: Thursday, Nov. 29, 2012