Spatio-Temporal Sequence Learning of Visual Place Cells for Robotic Navigation


Presented by Nguyen Vu Anh, 20 July 2010

Nguyen Vu Anh, Alex Leng-Phuan Tay, Wooi-Boon Goh

School of Computer Engineering, Nanyang Technological University, Singapore

Janusz A. Starzyk

School of Electrical Engineering, Ohio University, Athens, USA

IJCNN, WCCI, Barcelona, Spain, 2010

Outline

• Introduction
• HMAX Feature Building and Extraction
• Spatio-Temporal Learning and Recognition
• Empirical Results
• Conclusion and future directions

Introduction

• Robotic navigation: Localization and Mapping.
– Topological map & Place cells
– Scope: Topological Visual Localization

• Challenges:
– High dimension and uncertainty of visual features
– Perceptual aliasing
– Complex probabilistic frameworks, e.g. HMM

• Approach:
– Structural organization of human memory architecture.
– Short-Term Memory (STM) and Long-Term Memory (LTM) Interaction

Introduction

• System Architecture

Figure: system architecture with four stages: Feature Building and Extraction, Symbol Quantization, Sequence Storage, Classifier

Introduction

• Existing Works:

– Autonomous navigation (SLAM): Mapping, Localization and Path Planning
• Topological vs metric representation
• Humans mainly employ a topological representation of the environment [O’Keefe (1976), Redish (1999), Eichenbaum (1999), etc.]

– Visual Place-cell model: [Torralba (2001) ; Renninger&Malik (2004) ; Siagian&Itti (2007)]

• Hierarchical feature building and extraction (HMAX Model) [Serre et al (2007)]

– Spatio-Temporal sequence learning: [Wang & Arbib (1990) (1993), Wang & Yuwono (1995)]

• Our previous works: [Starzyk&He, (2007);Starzyk&He (2009);Tay et al (2007);Nguyen&Tay (2009)]

HMAX Feature Building and Extraction

• Interleaving simple (S) and complex (C) layers with increasing spatial invariance (Retina - LGN – V1 – V2,V4)

• 2 Stages:
– Feature Construction
– Feature Extraction

• Feature Significance:

HMAX Feature Building and Extraction

Figure: HMAX processing with Prototypes, Spatial Invariance Processing and Dot-Product Matching

Ref: Riesenhuber & Poggio (1999), Serre et al (2007)
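The two operations named on this slide can be sketched as follows. This is a minimal illustration, not the HMAX implementation used in the talk: the function names, the unit-length normalization and the global pooling over all positions are assumptions made for the example.

```python
import numpy as np

def s_layer_response(patches, prototypes):
    """Dot-product matching of image patches against stored prototypes.

    patches:    (n_positions, d) array, one flattened patch per position
    prototypes: (n_prototypes, d) array
    Both are scaled to unit length first (an assumption of this sketch),
    so each response is the cosine of the angle between patch and prototype.
    """
    p = patches / (np.linalg.norm(patches, axis=1, keepdims=True) + 1e-12)
    q = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-12)
    return p @ q.T  # shape (n_positions, n_prototypes)

def c_layer_response(s_responses):
    """Spatial invariance: keep the maximum response of each prototype over positions."""
    return s_responses.max(axis=0)

patches = np.array([[1.0, 0.0], [0.0, 2.0]])
prototypes = np.array([[1.0, 0.0], [1.0, 1.0]])
features = c_layer_response(s_layer_response(patches, prototypes))
```

Taking the maximum over positions means the feature vector does not change when a matching patch moves within the pooled region, which is the invariance the C layers are meant to provide.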

Spatio-Temporal Learning Architecture

• STM Structure:

– Quantization of input using KFLANN with vigilance ρ

See: Tay, Zurada, Wong and Xu, TNN, 2007
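As a rough illustration of vigilance-based quantization, the sketch below assigns each feature vector to an existing symbol when a fraction of at least ρ of its components lies within a tolerance of that symbol's centroid, and otherwise creates a new symbol. It is a simplified stand-in, not the KFLANN algorithm of the cited paper: the `tolerance` parameter, the function name and the absence of centroid re-estimation are assumptions of this example.

```python
import numpy as np

def quantize(vectors, tolerance, vigilance):
    """Map each input vector to a symbol index, creating symbols on demand.

    A centroid is a candidate when the fraction of components within
    `tolerance` of the input is at least `vigilance`; the nearest candidate
    (Euclidean distance) wins. If there is no candidate, the input itself
    becomes a new centroid.
    """
    centroids, symbols = [], []
    for x in vectors:
        x = np.asarray(x, dtype=float)
        best, best_dist = None, np.inf
        for j, c in enumerate(centroids):
            fraction_close = np.mean(np.abs(x - c) <= tolerance)
            dist = np.linalg.norm(x - c)
            if fraction_close >= vigilance and dist < best_dist:
                best, best_dist = j, dist
        if best is None:
            centroids.append(x)
            best = len(centroids) - 1
        symbols.append(best)
    return symbols, centroids

symbols, centroids = quantize(
    [[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [0.0, 0.05]],
    tolerance=0.1,
    vigilance=0.7,
)
```

The resulting symbol sequence, rather than the raw feature vectors, is what a sequence memory would then store.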




• LTM Cell Structure:

– Each LTM is learnt by one-shot mechanism.

– Each long training sequence is segmented into N overlapping subsequences of the same length M.

– Each subsequence is dedicated permanently to an LTM cell.
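The segmentation step can be sketched with a sliding window. The `step` parameter is an assumption of this example (the slides specify only the length M and that subsequences overlap); the overlap between neighbours is then `length - step`.

```python
def segment(sequence, length, step):
    """Split a sequence into overlapping subsequences of equal length.

    Consecutive subsequences start `step` elements apart, so they share
    `length - step` elements. A trailing remainder shorter than `length`
    is dropped in this sketch.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        sequence[start:start + length]
        for start in range(0, len(sequence) - length + 1, step)
    ]

# Each returned subsequence would be stored once, in its own LTM cell.
cells = segment(list(range(10)), length=4, step=2)
```

With `length=4` and `step=2`, each subsequence shares half of its elements with the next one.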


• LTM Cell Structure:

Figure: LTM cell structure (labels: Dual Neurons – STM; Primary Neurons – Primary Excitation)

• Storage
– One-shot learning

• Recognition
– Input feature vector
– Primary Excitation Computation
– Dual Neurons Update: Evidence Accumulation
– Output Matching Score from the last DN
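The recognition flow above can be illustrated with a small dynamic-programming sketch. The update rule here is a longest-common-subsequence recurrence chosen for the example; it is not the neuron equations of the paper, and the binary primary excitation and the normalization by the cell length are likewise assumptions.

```python
def ltm_matching_score(stored, observed):
    """Score how well an observed symbol sequence matches one stored LTM sequence.

    dual[k] plays the role of the k-th dual neuron: the evidence accumulated
    so far that the first k stored symbols have appeared in order. The score
    is read from the last dual neuron and scaled to [0, 1].
    """
    m = len(stored)
    dual = [0.0] * (m + 1)
    for symbol in observed:
        new = [0.0] * (m + 1)
        for k in range(1, m + 1):
            # Primary excitation: 1 when the input matches stored position k.
            primary = 1.0 if stored[k - 1] == symbol else 0.0
            # Keep old evidence, carry evidence forward, or extend the chain.
            new[k] = max(dual[k], new[k - 1], dual[k - 1] + primary)
        dual = new
    return dual[m] / m if m else 0.0

full = ltm_matching_score("ABCD", "ABCD")
with_gap = ltm_matching_score("ABCD", "ABD")
unrelated = ltm_matching_score("ABCD", "XYZ")
```

Because evidence is carried forward rather than reset, a sequence with a missing element still scores highly against the correct cell, while an unrelated sequence scores zero.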

Empirical Results

• ICLEF Competition 2010 Dataset
– 9 classes of places
– 2 sets of images with the same trajectory (Set S and Set C) (~4000 images each set)

Empirical Results

• Task
– 1 sequence (Set S) as training set and 1 sequence as testing set (Set R).

• Features:
– 10% of the training sequence

• Training
– ρ = 0.7.
– Segmentation into consecutive subsequences of equal length (100) with overlapping portion (>50%).
– Each subsequence is stored as an LTM cell.
– The label of each LTM cell is the majority label of its individual components.

• Testing
– The label is assigned as the label of the maximally activated LTM cell.
– If the activation of the maximally activated LTM cell is below θ, the system refuses to assign the label.
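The labeling and decision rules on this slide can be sketched as below. The function names and the single-letter place labels are hypothetical, and tie-breaking between equally frequent labels or equally activated cells is not specified in the slides; this sketch simply takes the first one encountered.

```python
from collections import Counter

def majority_label(frame_labels):
    """Label of an LTM cell: the most frequent label among its frames."""
    return Counter(frame_labels).most_common(1)[0][0]

def classify(activations, cell_labels, theta):
    """Return the label of the maximally activated LTM cell.

    Returns None (the system refuses to answer) when the highest
    activation is below the threshold theta.
    """
    best = max(range(len(activations)), key=activations.__getitem__)
    if activations[best] < theta:
        return None
    return cell_labels[best]

cell_labels = [majority_label(["C", "C", "K"]), majority_label(["K", "K", "K"])]
confident = classify([0.2, 0.9], cell_labels, theta=0.4)
refused = classify([0.2, 0.3], cell_labels, theta=0.4)
```

Raising the threshold trades coverage for accuracy: more frames are left unlabeled, but the labels that are assigned rest on stronger evidence.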

Empirical Results

Table: LTM listing with training set S

Empirical Results

• Accuracy without threshold

• Accuracy with threshold θ=0.4

• Robust testing: missing elements

Empirical Results

Figure: LTM cells’ activation during recall stage

Empirical Results

• Intersection case:

Conclusion

• A hierarchical spatio-temporal learning architecture

– HMAX hierarchical feature construction and extraction

– STM clustering by KFLANN

– Sequence storage and retrieval by LTM cells.

• Application in appearance-based topological localization

Future Directions

• Automatic tolerance estimation

– E.g. Signal-to-noise ratio figure of features [Liu&Starzyk 2008]

• Hierarchical episodic memory which characterizes the interaction between STM and LTM

– Other embodied intelligence components

– Goal creation system [Starzyk 2008]

• Application in other domains:

– Human Action Recognition

Thank you!
