Ubiquitous Cognitive Computing: A Vector Symbolic Approach
BLERIM EMRULI
EISLAB, Luleå University of Technology
Outline
Context and motivation
Aims
Background (concepts and methods)
Summary of appended papers
Conclusions and future work
Conventional computing
1 + 2/3 = 1.666…7
1010 XOR 1000 = 0010
1-64 bit variables
Cognitive computing
Concepts, relations, sequences, actions, perceptions, learning …
Some concepts
man ≅ woman
man ≇ lake
Cognitive computing (cont’d)
Bridging of dissimilar concepts
man - fisherman - fish - lake
man - plumber - water - lake
Relations between concepts and sequences
5 : 10 : 15 : 20
5 : 10 : 15 : 30
“… invisible, everywhere computing that does not live on a personal device of any sort, but is in the woodwork everywhere” (Weiser, 1994)
Mark Weiser, widely considered to be the father of ubiquitous computing
Ubiquitous cognitive computing is cognitive computing for ubiquitous systems, i.e., systems that in principle can appear “everywhere and anywhere” as part of the physical infrastructure that surrounds us.
high-level processing
low-level processing (sensory integration)
high-level “symbol-like” representations
Intuition
Aims
Investigate mathematical concepts and develop computational principles with cognitive qualities, which can enable digital systems to function more like brains in terms of:
learning/adaptation
generalization
association
prediction
…
Other desirable properties
computationally lightweight
suitable for distributed and parallel computation
robust, with graceful degradation
Related approaches
service-oriented architecture (SOA)
traditional artificial intelligence techniques
cognitive approach (Giaffreda, 2013; Wu et al., 2014)
Geometric approach to cognition
What can we do with words of 1 kilobyte or more?
Pentti Kanerva started to explore this idea in the 1980s
Engineering perspective with inspiration from biological neural circuits and human long-term memory
Since the 1990s, similar ideas have also been developed by Peter Gärdenfors, Professor at Lund University
[Figure: a 10,000-bit binary word (1 0 1 0 1 0 1 0 1 …) with bit positions numbered 1 to 10000]
Sparse Distributed Memory (SDM)
inspired by circuits in the brain
model of human long-term memory
associative memory
KEY IDEA: Similar or related concepts in memory correspond to nearby points in a high-dimensional space (Kanerva, 1988, 1993)
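The key idea can be illustrated with a toy sparse distributed memory. This is a minimal sketch, not the implementation used in the thesis: the word length (256 bits), the number of hard locations (2000), the activation radius (112) and all names are assumptions chosen so that the example runs quickly. Patterns are stored autoassociatively (each pattern is written at its own address), and a read from a corrupted address is expected to land nearer the stored pattern than the corrupted cue was.

```python
import numpy as np


class SparseDistributedMemory:
    """Toy SDM: random hard locations, Hamming-radius activation, counters."""

    def __init__(self, n_bits=256, n_locations=2000, radius=112, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random addresses of the hard locations.
        self.addresses = rng.integers(0, 2, size=(n_locations, n_bits), dtype=np.int8)
        # One up/down counter per bit and hard location.
        self.counters = np.zeros((n_locations, n_bits), dtype=np.int32)
        self.radius = radius

    def _active(self, address):
        # A location is activated if it lies within the Hamming radius.
        distances = np.count_nonzero(self.addresses != address, axis=1)
        return distances <= self.radius

    def write(self, address, data):
        # Add +1 for a 1-bit and -1 for a 0-bit in every activated location.
        self.counters[self._active(address)] += 2 * data.astype(np.int32) - 1

    def read(self, address):
        # Sum the counters of the activated locations and threshold at zero.
        total = self.counters[self._active(address)].sum(axis=0)
        return (total > 0).astype(np.int8)


def hamming(a, b):
    return int(np.count_nonzero(a != b))


rng = np.random.default_rng(1)
sdm = SparseDistributedMemory()
patterns = rng.integers(0, 2, size=(10, 256), dtype=np.int8)
for p in patterns:
    sdm.write(p, p)  # autoassociative storage: address == data

target = patterns[0]
noisy = target.copy()
noisy[rng.choice(256, size=26, replace=False)] ^= 1  # flip about 10% of the bits
recalled = sdm.read(noisy)
print("cue distance:", hamming(noisy, target), "recalled distance:", hamming(recalled, target))
```

Because nearby addresses activate overlapping sets of hard locations, the corrupted cue still reaches many of the counters that were updated when the pattern was written.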
SDM interpreted as computer memory
SDM interpreted as feedforward neural network
Vector symbolic architectures (VSAs)
Concepts and their interrelationships correspond to points in a high-dimensional space
Able to represent concepts, relations, sequences…; learn, generalize, associate…; and perform analogy-making using vector representations based on sound mathematical concepts and principles (Plate, 1994)
Vector symbolic architectures (VSAs) (cont’d)
VSAs were developed to address some early criticisms of neural networks (Fodor and Pylyshyn, 1988) while retaining useful properties such as learning, generalization, pattern recognition, robustness and noise immunity (30% corruption tolerable)
The VSA framework includes mathematical operators for constructing, operating on and querying compositional structures
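The operators mentioned here can be sketched with binary spatter codes, the VSA variant named in Paper B. The sketch below is illustrative only: the dimensionality, the role names (`relation`, `upper`, `lower`) and the helper names are assumptions. Binding is bitwise XOR, bundling is a bitwise majority vote, and a query unbinds a role and cleans up the noisy result against an item memory. The last step flips 30% of the bits of the record to probe the noise tolerance mentioned above.

```python
import numpy as np

D = 10_000  # vector dimensionality (assumed)
rng = np.random.default_rng(0)


def random_vector():
    return rng.integers(0, 2, size=D, dtype=np.uint8)


def bind(a, b):
    # Binding: bitwise XOR, which is its own inverse.
    return np.bitwise_xor(a, b)


def bundle(vectors):
    # Bundling: bitwise majority vote (use an odd number of vectors).
    total = np.sum(vectors, axis=0)
    return (2 * total > len(vectors)).astype(np.uint8)


def distance(a, b):
    # Normalized Hamming distance; about 0.5 for unrelated vectors.
    return np.count_nonzero(a != b) / D


def cleanup(noisy, items):
    # Return the name of the stored item nearest to the noisy vector.
    return min(items, key=lambda name: distance(noisy, items[name]))


roles = {name: random_vector() for name in ("relation", "upper", "lower")}
items = {name: random_vector() for name in ("above", "circle", "square", "triangle")}

# Compositional structure for "circle is above the square".
record = bundle([
    bind(roles["relation"], items["above"]),
    bind(roles["upper"], items["circle"]),
    bind(roles["lower"], items["square"]),
])

# Query: which item fills the "upper" role?
answer = cleanup(bind(record, roles["upper"]), items)

# Same query after flipping 30% of the bits of the record.
corrupted = record.copy()
corrupted[rng.choice(D, size=3 * D // 10, replace=False)] ^= 1
noisy_answer = cleanup(bind(corrupted, roles["upper"]), items)
print(answer, noisy_answer)
```

The whole record has the same size as any single item, which is the sense in which the representation is compressive.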
Analogy-making
Analogy-making is a central element of cognition that enables animals to identify and manage new information by generalizing past experiences, possibly from a few learned examples
Present theories of analogy-making usually divide this process into three or four stages (Eliasmith and Thagard, 2001)
My work is focused mainly on the challenging mapping stage
Analogical mapping
Analogical mapping is the process of mapping relations and concepts from one situation (a source), x, to another (a target), y; M : x → y
Analogical mapping
The process of mapping relations and concepts that describe one situation (a source) to another (a target)
Analogical mapping (cont’d)
Circle is above the square
Square is below the circle
Novel “above–below” relations
Generalization via analogical mapping
(Neumann, 2001)
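The mapping M : x → y and its use on novel inputs can be sketched with binary vectors. This is a hedged illustration under stated assumptions, not Neumann's model or the AMU: the role/filler encoding, the five training examples and all names are hypothetical. A mapping vector is bundled from example pairs ("x is above y" paired with "y is below x") and then applied to a pair of shapes that was never seen during learning.

```python
import numpy as np

D = 10_000  # vector dimensionality (assumed)
rng = np.random.default_rng(0)


def random_vector():
    return rng.integers(0, 2, size=D, dtype=np.uint8)


def bind(a, b):
    return np.bitwise_xor(a, b)


def bundle(vectors):
    # Bitwise majority vote over an odd number of vectors.
    total = np.sum(vectors, axis=0)
    return (2 * total > len(vectors)).astype(np.uint8)


def distance(a, b):
    return np.count_nonzero(a != b) / D


ABOVE, BELOW = random_vector(), random_vector()  # relation names
A1, A2 = random_vector(), random_vector()        # roles of "x is above y"
B1, B2 = random_vector(), random_vector()        # roles of "y is below x"


def above(x, y):
    return bundle([ABOVE, bind(A1, x), bind(A2, y)])


def below(y, x):
    return bundle([BELOW, bind(B1, y), bind(B2, x)])


# Learn one mapping vector from several source/target example pairs.
examples = [(random_vector(), random_vector()) for _ in range(5)]
mapping = bundle([bind(above(x, y), below(y, x)) for x, y in examples])

# Apply the mapping to shapes that did not occur in the examples.
circle, square = random_vector(), random_vector()
star, moon = random_vector(), random_vector()
prediction = bind(mapping, above(circle, square))

d_correct = distance(prediction, below(square, circle))  # intended target
d_other = distance(prediction, below(moon, star))        # same relation, other shapes
d_random = distance(prediction, random_vector())         # unrelated vector
print(d_correct, d_other, d_random)
```

The prediction is noisy, but it lies measurably closer to the intended target than to a "below" structure built from other shapes or to an unrelated vector, so a cleanup step over candidate targets would select the intended one.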
A difficult computational problem
If analogical mapping is treated as a graph comparison problem, it is computationally challenging
VSAs use compressive representations, not graphs
The ability to encode approximate, symbol-like representations makes VSAs computationally feasible and psychologically plausible (Gentner and Forbus, 2011; Eliasmith, 2013)
Sum-up
I have adopted a vector-based geometric approach to cognitive computation because it appears to be sufficiently potent and suitable for implementation in resource-constrained devices
A central part of the work deals with analogy-making and learning as a key mechanism enabling interoperability between heterogeneous systems, much like ontologies play a central role in service-oriented architecture and the semantic web
Raad and Evermann (2014): Is Ontology Alignment like Analogy?
Thesis – Appended papers
A. Emruli, B. and Sandin, F. (2014): Analogical Mapping with Sparse Distributed Memory: A Simple Model that Learns to Generalize from Examples
B. Emruli, B., Gayler, R. W., and Sandin, F. (2013): Analogical Mapping and Inference with Binary Spatter Codes and Sparse Distributed Memory
C. Emruli, B., Sandin, F. and Delsing, J. (2014): Vector Space Architecture for Emergent Interoperability of Systems by Learning from Demonstration
D. Sandin, F., Emruli, B. and Sahlgren, M. (2014): Random Indexing of Multi-dimensional Data
Thesis – Cognitive computation papers
A. Emruli, B. and Sandin, F. (2014): Analogical Mapping with Sparse Distributed Memory: A Simple Model that Learns to Generalize from Examples
B. Emruli, B., Gayler, R. W., and Sandin, F. (2013): Analogical Mapping and Inference with Binary Spatter Codes and Sparse Distributed Memory
Thesis – Cognitive architecture for ubiquitous systems paper
C. Emruli, B., Sandin, F. and Delsing, J. (2014): Vector Space Architecture for Emergent Interoperability of Systems by Learning from Demonstration
Thesis – Encoding vector representations paper
D. Sandin, F., Emruli, B. and Sahlgren, M. (2014): Random Indexing of Multi-dimensional Data
Emruli B. and Sandin F.
Cognitive Computation 6(1):74–88, 2014
Q1: Is it possible to extend the sparse distributed memory model so that it can store multiple mapping examples of compositional structures and make correct analogies from novel inputs?
Paper A
Analogical mapping unit (AMU)
SDM
Results: size of the memory and generalization
minimum probability of error
Emruli B., Gayler R. W. and Sandin F.
IJCNN 2013, Dallas, TX, Aug. 4–9, 2013
Paper B
Q2: If such an extended sparse distributed memory model is developed, can it learn and infer novel patterns in sequences such as those encountered in widely used intelligence tests like Raven’s Progressive Matrices?
Bidirectionality of mapping vectors
Bidirectionality problem
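A minimal sketch of why binary mapping vectors are bidirectional, assuming XOR is used as the binding operator (as in binary spatter codes); the variable names are hypothetical. Because XOR is commutative and its own inverse, a mapping vector built from a pair (x, y) maps x to y and y to x equally well, so the vector alone carries no direction.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=10_000, dtype=np.uint8)  # source representation
y = rng.integers(0, 2, size=10_000, dtype=np.uint8)  # target representation

mapping = np.bitwise_xor(x, y)          # mapping vector for x -> y

forward = np.bitwise_xor(mapping, x)    # apply the mapping to the source
backward = np.bitwise_xor(mapping, y)   # apply the same mapping to the target

print(np.array_equal(forward, y), np.array_equal(backward, x))  # True True
```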
Raven's Progressive Matrices
Rasmussen R. and Eliasmith C., Topics in Cognitive Science, Vol. 3, No. 1, 2011
Learning mapping vectors
SDM
Prediction
SDM
Results
Emruli B., Sandin F. and Delsing J.
Biologically Inspired Cognitive Architectures 9:33–45, 2014
Q3: Could extended sparse distributed memory and vector-symbolic methodologies such as those considered in Q1 and Q2 be used to address the problem of designing an architecture that enables heterogeneous IoT devices and systems to interoperate autonomously and adapt to instructions in dynamic environments?
Paper C
Communication architecture
No shared operational semantics (Sheth, 1999; Obrst, 2003; Baresi et al., 2013)
Automation system
Learning by demonstration
Interact with the four systems to achieve a particular goal
Instructions of Alice and Bob are the same
Alice Bob
Results
One instruction per day by Alice and Bob
Sandin F., Emruli B. and Sahlgren M.
Knowledge and Information Systems, submitted
Paper D
Q4: Is it possible to extend the traditional method of random indexing to handle matrices and higher-order arrays in the form of N-way random indexing, so that more complex data streams and semantic relationships can be analyzed? What are the other implications of this extension?
Random indexing (RI)
Random indexing is (was) an approximative method for dimension reduction and semantic analysis of pairwise relationships
Main properties
concepts and their interrelationships correspond to random points in a high-dimensional space
incremental coding/learning
lightweight, suitable for processing of streaming data
accuracy comparable to standard methods for dimension reduction
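The properties listed above can be illustrated with a small sketch of traditional (one-way) random indexing. It is an illustration under stated assumptions, not the N-way method of Paper D: the dimensionality (1000), the number of nonzero elements (10), the use of whole sentences as context windows and the toy sentences are all hypothetical choices. Each word gets a sparse random index vector, and the context vector of a word is updated incrementally by adding the index vectors of the words it co-occurs with.

```python
from collections import defaultdict

import numpy as np

DIM, NONZEROS = 1000, 10  # assumed sizes of the reduced space
rng = np.random.default_rng(0)


def new_index_vector():
    # Sparse ternary vector: a few +1 and -1 entries, zeros elsewhere.
    v = np.zeros(DIM, dtype=np.int32)
    positions = rng.choice(DIM, size=NONZEROS, replace=False)
    v[positions[: NONZEROS // 2]] = 1
    v[positions[NONZEROS // 2:]] = -1
    return v


index_vectors = defaultdict(new_index_vector)  # created on first use
context_vectors = defaultdict(lambda: np.zeros(DIM, dtype=np.int32))


def update(sentence):
    # Incremental step: add the index vectors of all other words in the sentence.
    for i, word in enumerate(sentence):
        for j, neighbour in enumerate(sentence):
            if i != j:
                context_vectors[word] += index_vectors[neighbour]


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


stream = [
    "the fisherman fishes in the lake".split(),
    "the fisherman fishes in the river".split(),
    "the plumber repairs a pipe".split(),
]
for sentence in stream:  # sentences are processed one at a time
    update(sentence)

sim_lake_river = cosine(context_vectors["lake"], context_vectors["river"])
sim_lake_pipe = cosine(context_vectors["lake"], context_vectors["pipe"])
print(sim_lake_river, sim_lake_pipe)
```

Words that occur in the same contexts ("lake" and "river") end up with nearby context vectors, and no co-occurrence matrix is ever built, which is what makes the method suitable for streaming data.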
Applications
natural language processing
search engines
pattern recognition (e.g., event detection in blogs)
graph searching (e.g., social network analysis)
other machine learning applications
Results: one-way versus two-way Random Indexing (RI)
Anecdote
“As an engineer, this can feel like a deal with the devil, as you have to accept error and uncertainty in your results. But the alternative is no results at all!”
Pete Warden, data scientist and a former Apple engineer
Results: two-way RI versus PCA
Gavagai AB: Opinion mining (2012)
                  Viewer votes   Gavagai forecast
Loreen            33 %           30 %
Danny Saucedo     22 %           22 %
Thorsten Flinck   8 %            12 %
Summary
The proposed AMU integrates the idea of mapping vectors with sparse distributed memory
Demonstration of transparent learning and application of multiple analogical mappings
The AMU solves a particular type of Raven’s matrix
The SDM breaks the commutative (bidirectionality) property of the binary mapping vectors
Summary (cont’d)
Outline of a communication architecture that enables system interoperability by learning, without reference to a shared operational semantics
Presenting a novel approach to a challenging problem
Extension of Random Indexing (RI) to multiple dimensions in an approximately fixed-size representation
Comparison of two-way RI with the traditional (one-way) RI and PCA
Limitations
Hand-coding of the representations
The examples addressed in Paper C are relatively simple; more complex examples and symbolic representation schemes are needed to further test the architecture
Attention mechanism needs to be developed
Extension to higher-order Markov chains
In Paper D only one- and two-way RI are investigated, and the problems considered are relatively small in scale and not demonstrated on streaming data
Future work
To apply the architecture outlined in Paper C in a “Living Lab” equipped with technology similar to that described in the hypothetical automation scenario
To improve and further investigate, both empirically and theoretically, the implications of the N-way random indexing (NRI) extension
Is the mathematical framework sufficiently general?
“A beloved child has many names.”
Holographic Reduced Representation (HRR) - 1994
Context-Dependent Thinning (CDT) - 2001
Vector Symbolic Architecture (VSA) - 2003
Hyperdimensional Computing (HC) - 2009
Analogical Mapping Unit (AMU) - 2013
Semantic Pointer Architecture (SPAUN) - 2013
Matrix Binding of Additive Terms (MBAT) - 2014
Key readings
Sparse Distributed Memory (Kanerva, 1988)
Conceptual Spaces (Gärdenfors, 2000)
Holographic Reduced Representation (Plate, 2003)
Geometry and Meaning (Widdows, 2004)
How to Build a Brain (Eliasmith, 2013)
The Geometry of Meaning (Gärdenfors, 2014)
Credits
Supervisors
JERKER DELSING, FREDRIK SANDIN, LENNART GUSTAFSSON
Coauthors
ROSS GAYLER, MAGNUS SAHLGREN
Discussions and inspiration
ASAD KHAN, PENTTI KANERVA, BRUNO OLSHAUSEN, CHRIS ELIASMITH
Financial support
STINT, ARROWHEAD PROJECT, NORDEAS NORRLANDSSTIFTELSE, AND THE WALLENBERG FOUNDATION
COLLEAGUES, FAMILY AND FRIENDS
THE END
… or perhaps the beginning