
The Neuro-Symbolic Code of Perception*

Rosemarie Velik

Vienna University of Technology, Austria / Tecnalia, Spain

[email protected]

Journal of Cognitive Science 11: 161-180, 2010. ©2010 Institute for Cognitive Science, Seoul National University.

*I would like to thank the excellent anonymous reviewers, whose feedback helped to significantly improve this article.

Perceptual research has so far been tackled on two different levels: the neural and the cognitive (symbolic) level. These two levels have so far mainly been investigated separately from each other and little is known about their correlations. Here, we present a coding scheme (neuro-symbolic coding) and a cognitive architecture (a neuro-symbolic network), which make it possible to unify these two levels. Based on this, hypotheses are presented for how the activation of millions of sensory receptors results in a unified, complex, multimodal perception, what the function of feedbacks in perception is, what influence focus of attention and knowledge have, and how binding is solved in perception. Furthermore, a hypothesis for the mechanisms involved in perceptual learning is proposed.

Keywords: perception model, neuro-symbolic network, knowledge, focus of attention, multimodal perception, binding problem, learning

1. Introduction

Perception is defined as the process of acquiring, selecting, organizing, and interpreting sensory information [22]. Decades of research have attempted to reveal the principles lying behind this process. D. Hubel and T. Wiesel greatly expanded the scientific knowledge of sensory processing in the primary visual cortex [15]. Based on experiments in the cat's visual cortex, they suggested a model for visual information processing with a 'simple-to-complex hierarchy' — a feed-forward sequence of more and more complex and invariant neuronal representations. A. Luria described the afferent system, i.e. the perceptual system, as organized in different hierarchical levels, namely the primary, secondary, and tertiary cortex, with different functions [18]. Perception has for a long time been seen as a passive, stimulus-driven process. These classical theories have focused on serial bottom-up processing in hierarchically organized neural architectures [15,18]. However, more recent approaches view perception as an active, highly selective process [5]. They emphasize the role of top-down processes in perception. One such top-down process is knowledge and expectation [6,43]. Another top-down process is focus of attention [3,26].

Despite all these research findings, a global understanding of the processes involved in perception has so far not been achieved. The question that still remains to be answered is how information from millions of sensory receptors, processed by an even larger number of nerve cells, results in a unified, multimodal perception of our world. The significance of this challenge is particularly clearly expressed by research concerned with the so-called binding problem — one of today's key questions about brain function, which has puzzled researchers for decades [10]. Various solutions have been suggested to the binding problem, namely combination coding [1,15], population coding [20,21], temporal coding [42], binding by attention [26], binding by knowledge, expectation, and memory [6], hardwired versus on-demand binding [13], bundling and binding of features [43], the feature-integration theory of attention [26], and synchronization through top-down processes [5]. Nevertheless, none of these hypotheses has so far led to a satisfactory answer. To date, a framework has been missing that puts all the available findings together into a comprehensible picture of the function of the whole perceptual system. In this article, we suggest such a framework and introduce a neuro-symbolic coding scheme that is applicable to all levels of perception.

The results we present come from an interdisciplinary research project involving the disciplines of artificial intelligence, computer engineering, neuroscience, and neuro-psychology. The original aim of this project was to use insights from neuroscience and neuro-psychology in order to develop more efficient and 'intelligent' machine perception systems [4,17,29,32,35,38,39]. During the course of the project, it turned out that the knowledge accumulated through these joint research efforts was not just highly valuable for the engineering discipline but also for the discipline of brain science [36,37]. By building an implementable and functioning machine perception system with the human brain as archetype, many new insights were gained about the coherence and interrelations of different brain theories and hypotheses, and blind spots and inconsistencies in these theories were identified. The result is a functional model of the perceptual system of the human brain that constitutes a considerable step towards an explanation of how perception works, from the level of sensory receptors up to the level of complex, multimodal perception.

2. Basic Perceptual Information Processing Units — The Right Level of Abstraction

When attempting to build a model of the brain, it is important to decide on the level of abstraction from which to start. This decision depends on the particular purpose of the model. Starting at too low a level, e.g., with the chemical processes in the nerve cells, poses the problem that one can get stuck in details that might not be necessary for a global understanding of the problem. Beginning at too high a level, e.g., with whole functional systems and their interaction, might turn out to be too vague to fully understand the given problem.

For the aim described in this article, which is to develop a model for perception, a new level of abstraction is introduced with so-called neuro-symbols as basic information processing units. This abstraction level arose from our observation that information in the perceptual system of the brain is processed by neurons, but that the mental correlate of neural firing patterns is symbolic information in the form of perceptual images (e.g., a face, a person, a melody, a voice, etc.) [30]. In the brain, neurons and neuron groups have been found that react, e.g., exclusively to the perception of a line, an edge, a colour, a face, a sound of a certain frequency, or a melody [9,15,20]. This implies that certain neurons or groups of neurons in the brain are responsible for the coding of certain symbolic information. Neuro-symbols take exactly this fact into account and allow a comprehensible and interpretable representation of the information processing and information flow in perception.

Neuro-symbols represent perceptual images — symbolic information — and show a number of analogies to the function principle of neurons. In figure 1, the function principle of neuro-symbols and a concrete example are illustrated. Each neuro-symbol has an activation degree with a value between 0 and 1 that indicates the degree (probability) with which the perceptual image it represents is present in the environment. Neuro-symbols have a number of inputs and one output. Via the inputs, information about the activation of certain other neuro-symbols representing other images is received, together with the so-called properties of these neuro-symbols (see next paragraph). These activation degrees are then summed up and normalized according to the number of excitatory inputs. If this sum exceeds a certain threshold value, the neuro-symbol is activated and, as a consequence, the information about its activation in the form of the value 0 or 1 is transmitted to other neuro-symbols via its output.1

Inputs of neuro-symbols can also be weighted differently in order to account for different reliabilities of incoming information, or to have an inhibitory effect. Neuro-symbols can not only process information received concurrently but also information coming in asynchronously, within a certain time window or in a certain temporal succession. This makes it possible to handle dynamic perceptual information and sequences of events — an issue which has been addressed before in [16,25,32]. Furthermore, neuro-symbols can carry so-called properties, which specify the perceptual image in more detail. Each property can have a range of different values. One important property is the location property, which indicates where in the environment a perceptual image is perceived. The usage of properties reflects the principle of population coding [21], according to which related perceptual images are not always represented by separate neurons but by a group of neurons. Also for the mentioned coding of the temporal succession of incoming signals, generally a group of neurons rather than a single neuron is necessary.

Figure 1. Function principle of a neuro-symbol. A neuro-symbol represents a perceptual image. Incoming information is information about the activation degrees and the values of properties of other neuro-symbols, or the activation of sensory receptors. The different activation degrees are weighted, summed up, and normalized. If this sum exceeds a certain threshold, the neuro-symbol is activated and its activation degree is transmitted further via the output. The concrete, slightly simplified example on the right shows how the activation of simple perceptual features like lines of certain orientations and a circle can activate a neuro-symbol representing a person.
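To make the function principle illustrated in figure 1 concrete, the following minimal sketch (in Python; the class name, weights, and threshold are illustrative assumptions, not part of the original model specification) computes the activation of a single neuro-symbol from weighted, normalized inputs:

    class NeuroSymbol:
        """Minimal neuro-symbol sketch: weighted inputs, normalization, threshold.
        All names and default values are illustrative assumptions."""

        def __init__(self, weights, threshold=0.5):
            self.weights = weights      # one weight per input; negative = inhibitory
            self.threshold = threshold

        def update(self, inputs):
            # Weighted sum of the activation degrees of connected neuro-symbols,
            # normalized by the number of excitatory inputs ...
            excitatory = sum(1 for w in self.weights if w > 0)
            s = sum(w * a for w, a in zip(self.weights, inputs)) / max(excitatory, 1)
            # ... passed through a threshold function: the output is the value 0 or 1.
            return 1.0 if s >= self.threshold else 0.0

    # A 'person' neuro-symbol fed by four feature neuro-symbols (cf. figure 1):
    person = NeuroSymbol(weights=[1.0, 1.0, 1.0, 1.0], threshold=0.75)
    print(person.update([1.0, 1.0, 1.0, 0.0]))   # 3/4 = 0.75 -> activated: 1.0

A weight below zero models an inhibitory input; the normalization over the number of excitatory inputs follows the description above.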

1 For a discussion of the use of activation functions different from the threshold function see [31].


3. Structural Organization and Information Flow

In order to process information of a certain complexity, neurons in the brain combine into neural networks and exchange information. The right structural organization of these networks is of crucial importance for correct and efficient functioning. This also holds true for the neuro-symbolic representation. Following descriptions of D. Hubel and T. Wiesel [15] and A. Luria [18], according to which the visual cortex and the perceptual cortex as a whole are structured in a modular hierarchical way, we introduced a structural arrangement of neuro-symbols into a so-called neuro-symbolic network, as depicted in figure 2.

The starting point for information processing is input data from sensory receptors. The information coming from these receptors is then processed by neuro-symbols in three stages, according to the characteristics and functions of the primary, secondary, and tertiary cortex of the perceptual system. These stages are labelled in the following as the neuro-symbolic feature level, the unimodal level, and the multimodal level. In the first two stages, information from each modality is processed separately and in parallel. In the third stage, neuro-symbolic information from all modalities is merged and results in a unitary multimodal perception of the environment. Each of the levels can in fact consist of a number of neuro-symbolic sub-layers. Furthermore, a sensory modality can consist of a number of sub-modalities, such as the somatosensory system, which comprises the tactile sense, the pain sense, the temperature sense, etc.

Figure 2. Modular hierarchical structure of a neuro-symbolic network. Starting point for information processing are data from sensory receptors, which are combined into more and more complex perceptual information. In the first two levels, information of each sensor modality is processed separately and in parallel. In the third level, information from all modalities is merged and results in a multimodal perception of the environment. Besides forward connections (solid lines) from lower to higher levels, feedbacks (dotted lines) and top-down connections (not depicted) are also crucial for an unambiguous and robust perception.

In the first stage, neuro-symbols represent simple features like edges, lines, colours, and movements of a certain velocity and in a certain direction for the visual modality, or sounds of a certain frequency for the auditory modality. Neuro-symbols are arranged topographically, which means that spatially neighbouring sensory receptors of a modality project their information onto neighbouring neuro-symbols. Neuro-symbols respond to certain features at a certain location. The higher the sub-layer of this stage, the more complex are the features the neuro-symbols respond to and the bigger is the spatial area they receive information from.

In the second stage, a combination of the extracted features of one modality results in a complex perception of all aspects of this particular modality. E.g., for the visual system, perceptual images like faces, a person, or other objects would be perceived. At this level, neuro-symbols respond to perceptual images irrespective of where in the environment they are perceived. The location is only coded additionally as a property of the neuro-symbols.

Figure 3. Example to illustrate the function of feedbacks. Without feedbacks, the four lower-level neuro-symbols representing a person would activate both the neuro-symbol 'person' and 'cross.' This is due to the fact that the neuro-symbol 'cross' is activated by a subset of the neuro-symbols that activate the neuro-symbol 'person.' To inhibit the undesired activation of the neuro-symbol 'cross,' there exists an inhibitory feedback connection from the neuro-symbol 'person' to the neuro-symbol 'cross.' As soon as the neuro-symbol 'person' is activated, the concurrent activation of the neuro-symbol 'cross' is inhibited by decreasing its activation degree so that it falls below the activation threshold.


Finally, on the highest level, neuro-symbols receive input from all sensory modalities. Information from all unimodal neuro-symbols is combined and merged into multimodal neuro-symbols. In higher sub-layers, information coming from different multimodal neuro-symbols can also be merged.
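As a structural illustration, the following sketch (Python; all symbol names, input values, and thresholds are invented for illustration and not taken from the article) wires feature, unimodal, and multimodal neuro-symbols in the three-stage arrangement of figure 2:

    # A hedged structural sketch of the three processing stages.
    def activate(inputs, threshold=0.5):
        # Normalized sum of input activation degrees against a threshold.
        return 1.0 if sum(inputs) / max(len(inputs), 1) >= threshold else 0.0

    # Stage 1: feature neuro-symbols, per modality (driven by receptor values).
    visual_features = {'vertical_line': activate([1.0]), 'circle': activate([1.0])}
    auditory_features = {'tone_440hz': activate([1.0])}

    # Stage 2: unimodal neuro-symbols combine features of one modality.
    unimodal = {
        'person_seen': activate([visual_features['vertical_line'],
                                 visual_features['circle']]),
        'voice_heard': activate([auditory_features['tone_440hz']]),
    }

    # Stage 3: multimodal neuro-symbols merge information from all modalities.
    multimodal = {'person_present': activate([unimodal['person_seen'],
                                              unimodal['voice_heard']])}
    print(multimodal)   # {'person_present': 1.0}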

According to D. Hubel and T. Wiesel [15], the visual cortex has a simple-to-complex forward structure. This means that information from a lower level responsible for the calculation of simple features is always passed to the next higher level responsible for the processing of more complex features (solid lines in figure 2). Nevertheless, the cortex in fact comprises a large number of feedback connections. So far, it has not been clearly understood what the use of these feedbacks could be. One main purpose that feedbacks can serve is illustrated in figure 3. We propose that via feedbacks, the activation of neuro-symbols can be inhibited in the case that certain lower-level neuro-symbols concurrently activate more than one neuro-symbol of a higher level. This happens if one higher-level neuro-symbol is connected to a sub-group of the lower-level neuro-symbols that make up another higher-level neuro-symbol. Another function of feedbacks seems to be the transfer of activations with a temporal delay, which will however not be considered further in this article.
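A minimal numerical sketch of the figure 3 scenario (Python; the feature set, weights, and thresholds are illustrative assumptions):

    # 'cross' is driven by a subset (2 of 4) of the features that drive 'person';
    # once 'person' fires, its inhibitory feedback pushes 'cross' below threshold.
    features = [1.0, 1.0, 1.0, 1.0]        # lower-level feature activations
    person_in = sum(features) / 4          # normalized over its 4 excitatory inputs
    cross_in = sum(features[:2]) / 2       # normalized over its 2 excitatory inputs

    person = 1.0 if person_in >= 0.75 else 0.0
    feedback = -0.6 * person               # inhibitory feedback connection
    cross = 1.0 if cross_in + feedback >= 0.75 else 0.0
    print(person, cross)                   # 1.0 0.0 -- 'cross' is suppressed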

4. Pre-wiring versus Learning

The essential information of neural and neuro-symbolic networks does not lie in the basic information processing units (neurons, neuro-symbols) themselves but in their connections. So far, brain researchers are not in agreement about how these connections are formed. According to A. Luria [18], higher cortical levels can only develop after lower levels have already evolved. This indicates that at least the connections in the secondary and tertiary cortex are learned by experience. This seems quite logical, as we cannot know from birth about the physical appearance of particular complex perceptual images. In these higher levels, learning may consist of a combination of supervised and unsupervised learning processes. The question that is even more controversial is when and how connections in the primary cortex, which responds only to simple features, are formed. Here, two disparate theories exist [14]. According to the first, the primary cortex is pre-wired at birth (somehow guided by the genetic code). The second theory is that connections in the primary cortex are also formed by experience. This process might already start before birth [23].

We support the second theory and suggest that there exist a number of learning principles that are the same throughout the whole brain. Identifying these rules of learning will have an invaluable impact on our understanding of the brain and also on the design of self-learning artificial intelligent systems.

We hypothesize that the following three factors form the basis of the learning process, according to which learning basically depends on the geometrical relations of neurons (or, at our particular level of abstraction, neuro-symbols), the time duration of axon and dendrite growth, and Hebb's law of correlations in activations. These rules are still subject to refinement and further verification, but they give a starting point for understanding how learning in the brain might take place (a toy simulation of these rules follows the list):

1. When a neuron begins to grow, one or more branches of the neuron start to grow in random directions.

2. When a branch touches another neuron, it connects. Connections can be either excitatory or inhibitory, whereby excitatory connections are more likely at the beginning of neural growth. Neurons that are spatially close are more likely to connect first, because it takes less time until the length of the branches is sufficient for attaching to another neuron. This is particularly well understandable when considering the primary cortex, where neighbouring sensory receptors and neurons of one layer project onto neighbouring neurons of the next layer [9].

3. The formed connections remain stable for a certain time so that other neurons can also connect. Connections that fire at the same time are strengthened. Connections that fire asynchronously to the majority are weakened until they disappear. This is in accordance with the Hebbian rule 'what fires together, wires together' and the 'use it or lose it' principle [11, 12]. Furthermore, recent research confirms the influence of synaptic activity on the development of neurons' shape [7, 19].
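The following toy simulation (Python; the neuron count, growth rate, and learning rates are invented parameters, not values from the article) sketches how these three rules could interact:

    import random

    random.seed(0)
    positions = [(random.random(), random.random()) for _ in range(20)]

    def dist(a, b):
        return ((a[0] - b[0])**2 + (a[1] - b[1])**2) ** 0.5

    # Rules 1+2: branches grow over time; spatially close neurons connect first.
    weights, reach = {}, 0.0
    for step in range(50):
        reach += 0.02                          # branch length grows each step
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                if (i, j) not in weights and dist(positions[i], positions[j]) <= reach:
                    weights[(i, j)] = 0.1      # new, initially weak connection

    # Rule 3: Hebbian update -- correlated firing strengthens, uncorrelated weakens.
    for step in range(100):
        fired = [random.random() < 0.5 for _ in positions]
        for (i, j), w in list(weights.items()):
            if fired[i] and fired[j]:
                weights[(i, j)] = min(w + 0.05, 1.0)   # 'fires together, wires together'
            elif fired[i] != fired[j]:
                weights[(i, j)] = w - 0.01             # 'use it or lose it'
                if weights[(i, j)] <= 0.0:
                    del weights[(i, j)]                # connection disappears
    print(len(weights), "surviving connections")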


5. Top-down Processes

The starting point for perception is generally information coming from sensory receptors. However, this information is not always sufficient for an unambiguous recognition of the environment. What we already know about the world and what we expect to perceive crucially influences our perception [24, 43]. Knowledge and expectation are considered top-down processes, with an information flow from higher cortical levels to lower ones.

Figure 4. Example to illustrate the interaction of knowledge and expectation with perception. From the visual unimodal neuro-symbol 'persons round table' and the acoustic unimodal neuro-symbol 'voices', both the multimodal neuro-symbols 'breakfast' and 'meeting' can be triggered. Which perception is actually the right one is decided by the integration of knowledge and expectation. The knowledge that the neuroscience group usually has a 'social breakfast' on Monday at 9:00 in the morning in the department kitchen increases the activation degree of the neuro-symbol 'breakfast' and decreases the activation degree of the neuro-symbol 'meeting'.


So far, conceptions of how and on what level knowledge and expectation intervene in the perception process have been vague and often controversial. In our neuro-symbolic coding model, we suggest that knowledge and expectation can influence the neuro-symbolic perceptual levels by acting as additional input signals for neuro-symbols and increasing or decreasing their activation degrees. An example of how this may look is given in figure 4.
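A minimal numerical sketch of the figure 4 example (Python; the bias values and threshold are illustrative assumptions):

    visual = 1.0       # unimodal neuro-symbol 'persons round table'
    acoustic = 1.0     # unimodal neuro-symbol 'voices'
    bottom_up = (visual + acoustic) / 2     # both candidates receive the same drive

    # Knowledge: Monday, 9:00, department kitchen -> expect the 'social breakfast'.
    knowledge_bias = {'breakfast': +0.2, 'meeting': -0.3}
    threshold = 0.9
    for symbol, bias in knowledge_bias.items():
        degree = bottom_up + bias           # knowledge acts as an extra input signal
        state = 'activated' if degree >= threshold else 'inhibited'
        print(symbol, state, round(degree, 2))   # breakfast activated, meeting inhibited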

In principle, all cortical levels (primary, secondary, and tertiary cortex) are conceivable sites for the interaction between knowledge and perception [6]. Nevertheless, simulation results show that connections to higher levels are more effective: it might not make sense to apply high cognitive reasoning processes to something as simple as an edge detector in the primary cortex. Furthermore, simulation results show that using many excitatory connections from higher levels to lower ones can cause oscillations in activations and therefore instability. To keep the system stable, the inhibition of activations seems to be essential. The topic of stability has, to our knowledge, so far not been considered in this context but is in fact an important issue.
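A toy demonstration of this stability issue (Python; the wiring and weights are invented for illustration): with an excitatory top-down loop, activity becomes self-sustaining even after the stimulus disappears, whereas an inhibitory loop lets it settle:

    # A lower- and a higher-level neuro-symbol connected in a loop,
    # with a stimulus present only at t = 0.
    def run(feedback_weight, steps=6):
        low, high, history = 1.0, 0.0, []
        for t in range(steps):
            stimulus = 1.0 if t == 0 else 0.0
            high = 1.0 if low >= 0.5 else 0.0                           # bottom-up drive
            low = 1.0 if stimulus + feedback_weight * high >= 0.5 else 0.0
            history.append(high)
        return history

    print(run(+1.0))   # excitatory loop: [1.0, 1.0, 1.0, ...] -- self-sustaining
    print(run(-0.5))   # inhibitory loop: [1.0, 1.0, 0.0, ...] -- settles down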

Figure 5. Representation by separate neuro-symbols. On the unimodal neuro-symbolic level, each person is represented by a separate neuro-symbol 'person'. Therefore, feature neuro-symbols that represent features within a close spatial area (marked red, blue, and green) are always combined.


Figure 6. Representation by a group neuro-symbol. Instead of representing each person by a separate unimodal neuro-symbol, there exists a neuro-symbol 'group of persons'. This also allows perception even when some of the features of the particular persons are missing (e.g., when covered by another object).

Figure 7. Serialization by focus of attention. Focus of attention serializes the perception process by directing processing power to one area at a time. For this purpose, the activation of feature neuro-symbols that fall into this area is increased in comparison to feature neuro-symbols outside this area. After processing is finished, the focus of attention is switched to the next area.


A further crucial top-down process involved in perception is focus of attention. Like every information processing system, the brain has a constrained processing capacity and is overloaded if too many different events occur at the same time. This is best explained by means of a concrete example (see figures 5 to 7). The example illustrates three possibilities for how the brain could handle the perception of multiple objects or events. In the example, three persons are present in the environment and within the field of view at the same time.

Following our model, one possibility for perceiving them would be to have available one neuro-symbolic representation for each person — i.e., three unimodal neuro-symbols 'person' (see figure 5). However, considering how many different images we can perceive that can occur in larger numbers, this would be an inefficient way of coding, especially in higher cortical levels, as it would lead to a combinatorial explosion [28]. A possibility to overcome this problem would be to have available a neuro-symbol 'group of persons' instead of separate neuro-symbols for each person (see figure 6). This reduces the degree of detail of the perceptual information (e.g., the exact position of each person) but is sufficient for most purposes. To increase the information content in case of necessity, the mechanism called focus of attention can be applied. By focus of attention, the processing power is directed to a certain area of the environment which is currently relevant [3]. After having processed this information, attention is switched to the next area, and so on. From neuroscience, it is not clear at which level focus of attention interacts with perception [6]. In the model, interaction is theoretically possible at all levels; however, as simulations show, interaction is most efficient at the feature neuro-symbol level (see figure 7). Features of objects and events located outside the area of attention at a certain time are not considered in the higher processing levels (for details see [31]). In addition to locations, attention can also be directed to certain object features (e.g., colour, shape, motion) or modalities.
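The following sketch (Python; the scene content, boost, and damping factors are illustrative assumptions) serializes processing by boosting the features inside the attended area, as in figure 7:

    scene = {                                # feature activations by spatial area
        'area_1': {'head': 1.0, 'torso': 1.0},
        'area_2': {'head': 1.0, 'torso': 1.0},
        'area_3': {'head': 1.0, 'torso': 0.0},
    }

    def attend(scene, focus, boost=0.5, damping=0.2):
        # Raise activations inside the focused area, damp those outside it.
        return {area: {f: a + boost if area == focus else a * damping
                       for f, a in feats.items()}
                for area, feats in scene.items()}

    for focus in scene:                      # switch the focus, area by area
        weighted = attend(scene, focus)
        print(focus, weighted[focus])        # only this area reaches higher levels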

6. Binding without Problems

The perceptual system of the brain is a distributed system that consists of millions of nerve cells being interconnected and firing in parallel. This raises the question of how this parallel processing of sensory information can finally result in a unified perception and experience. This is the so-called binding problem, which has been the subject of many discussions and theories concerning its solution (see section 1). Each of these suggested solutions certainly plays some role in binding but never gives a complete answer. Nevertheless, the different solutions to the binding problem are not mutually exclusive [27]. Their combination might make it possible to give a satisfactory explanation of how binding is performed in the human brain. Such a coherent combination has in fact been achieved with the perceptual model proposed here. A detailed discussion of this solution has been presented in [36]. In the following, a brief summary is given:

It is suggested that in the lower levels of perception, corresponding to the primary cortices of the different sensor modalities, combination coding (also called 'grandmother-cell' coding) [1, 15] is used for binding information from sensory receptors. The primary cortices have a topographic structure. It is argued that, e.g., for the visual cortex, different features like lines, edges, and colours at different locations in the visual field are represented by different neurons (or, in our case, neuro-symbols). For the same feature at a neighbouring position in the visual field, neighbouring neuro-symbols are used for representation. In higher levels, corresponding to functions of the secondary and tertiary cortex, a combination of principles inspired by population coding and temporal coding is identified as suitable. Perceptual images are represented by neuron groups as in population coding [20, 21], no matter at what location in the environment the images are perceived or what orientation or size they have. Variations in appearance are also possible. The location information is added as additional information, probably through special neural firing patterns as in temporal coding [42]. Location information is important in binding in order not to incorrectly bind together features that do not originate from the same object or event in the environment. The learning concepts used support the theory of hardwired versus on-demand binding [13]. Additionally, top-down mechanisms coming from knowledge and focus of attention are integrated, which are in accordance with the bundling and binding theory [43], binding by knowledge, expectation, and memory [6], binding by attention [26], the feature-integration theory of attention [26], and the theory of synchronization through top-down processes [5].
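As an illustration of the role of location information, the following sketch (Python; the feature set and distance tolerance are invented) binds features into one object only if their location properties coincide:

    features = [
        {'type': 'red', 'location': (2, 3)},
        {'type': 'circle', 'location': (2, 3)},
        {'type': 'blue', 'location': (7, 1)},
    ]

    def bind(features, tolerance=1.0):
        objects = []                         # each object is a list of bound features
        for f in features:
            for obj in objects:
                ox, oy = obj[0]['location']
                fx, fy = f['location']
                if abs(ox - fx) + abs(oy - fy) <= tolerance:
                    obj.append(f)            # locations match: bind to this object
                    break
            else:
                objects.append([f])          # no match: start a new object
        return objects

    for obj in bind(features):
        print([f['type'] for f in obj])      # ['red', 'circle'], then ['blue']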

7. Model Simulation

To test and evaluate the functionality of the proposed model, a computer simulation was run in the simulation environment AnyLogic. AnyLogic allows simulated parallel processing and modular hierarchical design, which is crucial for the suggested model. Furthermore, it comprises different design elements that make it possible to realize neuro-symbols — the basic elements of the system — without major effort.

Input data for the model, in the form of sensor values (and also neuro-symbolic information), were provided by a sensor data generator. The purpose of this sensor data generator is to simulate the activation of sensory receptors (or low-level neuro-symbols) based on objects and events occurring in a virtual environment. The test scenarios that were simulated were different activities performed by persons in an office environment [2, 8, 31].

The sensory modalities originally used in the simulation were vision, the auditory sense, and touch. However, the computational load that had to be coped with turned out to be proportional to the number of neuro-symbols in the simulation, which resulted in a very time-consuming simulation. Therefore, in order to reduce the computational load, only the tactile modality was simulated beginning from sensor values and carrying on with the processing levels of the primary cortex. The use of 361 sensory receptors turned out to be sufficient for a proof of concept and required 14019 neuro-symbols arranged in different sub-layers [31]. The outcome of these lowest levels of processing was the tactile perception of shapes of three different sizes at different locations, being either static or moving in one of eight possible directions within three different velocity ranges.
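As a rough illustration (assuming the reported features combine freely, which the article does not state explicitly), the shape/motion percepts distinguished per location can be counted as:

    sizes = 3                        # three shape sizes
    motion_states = 1 + 8 * 3        # static, or 8 directions x 3 velocity ranges
    print(sizes * motion_states)     # 75 shape/motion combinations per location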

As similar processing principles can be assumed for vision and audition as for tactile perception (although with different features), the simulation of these lowest levels was omitted for the visual and auditory modality. Therefore, the sensor value generator did not scale down visual and auditory stimuli to the sensory level but directly provided neuro-symbolic information entering the unimodal level, which represents the information processing performed in the secondary cortex of these modalities. The unimodal level was simulated for all three perceptual modalities and provided input for the multimodal level. Activated symbols of the multimodal level were considered the outcome of the perception process and were used for the evaluation of the model's functionality.

In the simulation, the top-down mechanisms were implemented as a separate module that received input data from the neuro-symbolic network and fed information back to it after processing. Internally, the top-down processes were represented in a rule-based format.
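The article does not specify the rule format; a plausible minimal sketch (Python; the rule content and state keys are invented, echoing the figure 4 example) of such a module could look as follows:

    # Rule-based top-down module: reads the network state, returns
    # activation-degree corrections to be fed back into the network.
    def top_down_rules(state):
        adjustments = {}
        if (state.get('weekday') == 'Monday' and state.get('hour') == 9
                and state.get('location') == 'kitchen'):
            adjustments['breakfast'] = +0.2
            adjustments['meeting'] = -0.3
        return adjustments

    print(top_down_rules({'weekday': 'Monday', 'hour': 9, 'location': 'kitchen'}))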

8. Discussion

In this article, we presented a concept which makes it possible for the first time to explain and integrate the different levels of perception in one model. The key to this is to consider perception on a neuro-symbolic abstraction level and to arrange neuro-symbols in a modular hierarchical architecture including bottom-up and top-down mechanisms. Computer simulations showed that the proposed model is actually technically implementable and functional.

The significance of the developed model from the point of view of neuroscience lies in the fact that it is the first model that unifies the neural and the cognitive abstraction level of perception in a conclusive way. In contrast to former models, which always focused only on certain aspects of perception, it provides a global picture of how perception works. It explains how neural firing can result in robust, unified, complex perceptions and how the different mechanisms involved in this process work together. Developing such a global framework of perception was a very important step, as it now makes it possible to handle particular issues of perception in more detail without losing track of the global functionality that in the end has to emerge from the different sub-tasks.

With this model, the purposes of feedbacks in the perceptual system can be explained, and it can be illustrated how and on what level the top-down mechanisms of knowledge and focus of attention interact with perception. The model presents a solution to the binding problem in perception — a problem that has puzzled researchers for more than 30 years. For this purpose, different existing approaches to solving the binding problem were merged and supplemented with insights from other areas of perceptual research. It furthermore turned out that location information is of crucial importance in binding. We propose that the textbook view that there exist two completely distinct, separate pathways for object recognition and spatial object location in the visual system, directed towards two different brain areas, should in fact be reconsidered. What remains to be answered is how location information is actually coded, for which the theory of temporal coding in binding might be a helpful starting point. Another option is that cross-modal neurons in the different modalities take over a function in location merging between modalities.

Furthermore, a hypothesis was presented for how learning is performed in the brain, based on characteristics of neural growth, geometrical proximity, and Hebb's rule of neural firing and wiring. Further experiments and simulations are planned to verify this hypothesis.

The last point to mention is that simulation results showed that for a system like the brain, with lots of feedbacks and top-down connections, the stability of the system is called into question. It is therefore highly remarkable that the brain can achieve stable representations of perceptions and concepts — a fact that has so far not been sufficiently stressed by researchers modeling brain functionalities. In our computational model, stability was achieved by integrating certain filter and inhibition mechanisms into the higher processing layers (from knowledge and focus of attention to the neuro-symbols). This implies that in the brain, higher cortical levels might be responsible for filtering information in order to guarantee stability — a hypothesis that will be subject to further verification.

Coming back to the discipline of machine perception, which was the point of departure for the research described here, the importance of the model lies in the fact that it provides a powerful and flexible tool for machine perception, allowing the automation of processes for which, up to today, human observers and their cognitive capabilities were still necessary. The model has seen first applications in sensor fusion [34], scene perception and interpretation [33], autonomous perception and decision making [40], as well as alerting [41] in interactive buildings.


References

[1] Barlow, H.B. Single Units and Sensation: A Neuron Doctrine for Perceptual Psychology. Perception, 1: 371-394, (1972).

[2] Burgstaller, W. Interpretation of Situations in Buildings. PhD thesis, Vienna University of Technology, (2007).

[3] Chun, M.M. and Wolfe, J.M. Visual Attention. In: Goldstein, B. (ed.), Blackwell's Handbook of Perception, Blackwell, 272-310, (2001).

[4] Dietrich, D., Fodor, G., Zucker, G., and Bruckner, D. Simulating the Mind: A Technical Neuropsychoanalytical Approach. Springer Verlag, (2008).

[5] Engel, A.K., Fries, P., and Singer, W. Dynamic Predictions: Oscillations and Synchrony in Top-Down Processing. Nature Reviews Neuroscience, 2, October (2001).

[6] Ernst, M.O. and Buelthoff, H.H. Merging the Senses into a Robust Percept. Trends in Cognitive Sciences, 8(4): 162-169, April (2004).

[7] Gafarov, F., Khusnutdinov, N., and Galimyanov, F. The Simulation of the Activity Dependent Neural Network Growth. arXiv:0903.1012v1 [q-bio.NC], (2009).

[8] Goetzinger, S.O. Scenario Recognition based on a Bionic Model for Multi-Level Symbolization. Master thesis, Vienna University of Technology, (2006).

[9] Goldstein, E.B. Wahrnehmungspsychologie. Spektrum Akademischer Verlag, (2002).

[10] Golledge, H.D.R., Hilgetag, C.C., and Tovee, M.J. Information Processing: A Solution to the Binding Problem? Current Biology, 6(9): 1092-1095, (1996).

[11] Hebb, D.O. The Organization of Behavior: A Neuropsychological Theory. Lawrence Erlbaum, (2002).

[12] Hebb, D.O. The Mammal and his Environment. Am. J. Psychiatry, 111(11): 826-831, (1955).

[13] Hommel, B. and Colzato, L.S. When an Object is more than a Binding of its Features: Evidence for Two Mechanisms of Visual Feature Integration. Visual Cognition, 17: 120-140, (2009).

[14] Hubel, D.H. Eye, Brain, and Vision. W.H. Freeman and Co, (1988).

[15] Hubel, D.H. and Wiesel, T.N. Receptive Fields, Binocular Interaction and Functional Architecture in the Cat's Visual Cortex. J. Physiol., 160: 106-154, (1962).

[16] Husserl, E. The Phenomenology of Internal Time Consciousness. Trans. J.S. Churchill. Bloomington, IN: Indiana University Press, (1964).

[17] Lang, R., Bruckner, D., Velik, R., and Deutsch, T. Scenario Recognition in Modern Building Automation. International Journal of Intelligent Systems and Technologies, 4(1): 36-44, (2009).

[18] Luria, A.R. The Working Brain — An Introduction to Neuropsychology. Basic Books, (1973).

[19] Pulver, S. Synaptic Activity Makes Fly Neurons Shape Up. Journal of Experimental Biology, 212, (2009).

[20] Quian Quiroga, R., Reddy, L., Kreiman, G., Koch, C., and Fried, I. Invariant Visual Representation by Single Neurons in the Human Brain. Nature, 435: 1102-1107, (2005).

[21] Quian Quiroga, R., Kreiman, G., Koch, C., and Fried, I. Sparse but not 'Grandmother-cell' Coding in the Medial Temporal Lobe. Trends in Cognitive Sciences, 12(3): 87-91, (2008).

[22] Seric, L., Stipanicev, D., and Stula, M. Observer Network and Forest Fire Detection. Information Fusion, doi:10.1016/j.inffus.2009.12.003, article in press, (2009).

[23] Shatz, C.J. The Developing Brain. Sci. Am., 267: 60-67, (1992).

[24] Solms, M. and Turnbull, O. The Brain and the Inner World: An Introduction to the Neuroscience of Subjective Experience. Other Press, New York, (2002).

[25] Tani, J. The Dynamical Systems Accounts for Phenomenology of Immanent Time: An Interpretation by Revisiting a Robotics Synthetic Study. Journal of Consciousness Studies, 11(9): 5-24, (2004).

[26] Treisman, A.M. and Gelade, G. A Feature-Integration Theory of Attention. Cognitive Psychology, 12: 97-136, (1980).

[27] Treisman, A. The Binding Problem. Current Opinion in Neurobiology, 6: 171-178, (1996).

[28] Triesch, J. and von der Malsburg, C. Binding: A Proposed Experiment and a Model. Proceedings of the ICANN 96, 685-690, Springer Verlag, (1996).

[29] Velik, R. A Model for Multimodal Humanlike Perception based on Modular Hierarchical Symbolic Information Processing, Knowledge Integration, and Learning. Proceedings of the 2nd International Conference on Bio-Inspired Models of Network, Information, and Computing Systems, (2007).

[30] Velik, R. and Bruckner, D. Neuro-Symbolic Networks: Introduction to a New Information Processing Principle. Proceedings of the 6th International Conference on Industrial Informatics, (2008).

[31] Velik, R. A Bionic Model for Human-like Machine Perception. SVH Verlag, (2008).

[32] Velik, R. A Bionic Model for Human-like Machine Perception. PhD thesis, Vienna University of Technology, (2008).

[33] Velik, R. and Bruckner, D. A Bionic Approach to Dynamic, Multimodal Scene Perception and Interpretation in Buildings. International Journal of Intelligent Systems and Technologies, 4(1): 1-9, (2009).

[34] Velik, R., Bruckner, D., Lang, R., and Deutsch, T. Emulating the Perceptual System of the Brain for the Purpose of Sensor Fusion. In: Human-Computer System Interaction: Backgrounds and Applications, 17-27, Springer Berlin/Heidelberg, (2009).

[35] Velik, R., Bruckner, D., and Palensky, P. A Bionic Approach for High-Efficiency Sensor Data Processing in Building Automation. Proceedings of the 35th Annual Conference of the IEEE Industrial Electronics Society, (2009).

[36] Velik, R. From Neuron-firing to Consciousness — Towards the True Solution of the Binding Problem. Neuroscience and Biobehavioral Reviews, 34(7): 993-1001, (2010).

[37] Velik, R. Why Machines Cannot Feel. Minds and Machines, 20(1): 1-18, (2010).

[38] Velik, R. Towards Human-like Machine Perception 2.0. International Review on Computers and Software (IRECOS), Special Section on Advanced Artificial Networks, July, (2010).

[39] Velik, R. Quo Vadis, Intelligent Machine? BRAIN (Broad Research in Artificial Intelligence and Neuroscience), 1(4), October, (2010).

[40] Velik, R. and Zucker, G. Autonomous Perception and Decision Making in Buildings. IEEE Transactions on Industrial Electronics, 57(11), November, (2010).

[41] Velik, R. and Boley, H. Neuro-symbolic Alerting Rules. IEEE Transactions on Industrial Electronics, 57(11), November, (2010).

[42] von der Malsburg, C. The What and Why of Binding: The Modeler's Perspective. Neuron, 24: 95-104, (1999).

[43] Wolfe, J.M. and Cave, K.R. The Psychophysical Evidence for a Binding Problem. Neuron, 24: 11-17, September (1999).