Visuo-verbal interactions in working memory: Evidence from event-related potentials
www.elsevier.com/locate/cogbrainres
Cognitive Brain Research
Research Report
Visuo-verbal interactions in working memory: Evidence from
event-related potentials
Jan Peters a,b,*, Boris Suchan b, Yaxin Zhang a, Irene Daum a,b
a International Graduate School of Neuroscience, Ruhr-University Bochum, Germany
b Institute of Cognitive Neuroscience, Department of Neuropsychology, Ruhr-University Bochum, Germany
Accepted 10 July 2005
Available online 15 August 2005
Abstract
Working memory is thought to involve separate modality-specific storage systems. Interactions between these storage systems were
investigated using a novel cross-modal 2-back paradigm. 2-back, 1-back and target items were presented either visually as a verbalizable
linedrawing or auditorily as a digitized spoken word. ERPs for auditory targets were primarily modulated by the presentation modality of the 2-
back item, whereas ERPs for visual targets were largely modulated by presentation modality of the 1-back item. Results indicate that
verbalizable pictures are only partially transformed into a phonological code for rehearsal in working memory. Furthermore, results support the
idea of a more stable and persistent auditory short-term store as opposed to a more transiently activated visual store for verbalizable material.
© 2005 Elsevier B.V. All rights reserved.
Theme: Neural Basis of Behavior
Topic: Cognition
Keywords: Working memory; Event-related potential; N100; P200; Slow wave; Visuo-verbal; Cross-modal
1. Introduction
Working memory, the short-term maintenance and manipu-
lation of information, is a critical component of intelligent
behavior. Specific subsystems have been described for the
maintenance of material from different modalities. The original
model of working memory involved short-term buffers for
phonological and visuo-spatial material under the control of a
central executive [1,3]. The interaction between the two
modality-specific storage systems is as yet unclear.
The issue of cross-modal transformation of content in
working memory has mainly been investigated with respect
to a possible phonological re-coding of verbalizable visual
material. Based on findings from earlier psychophysical
experiments, Penney [22] suggested a dual processing
stream model, with partly independent and partly overlapping
processes for visually and auditorily presented
words. By contrast, Schumacher et al. [27] reported findings
from a PET study, in which regions involved in rehearsal of
visual–verbal and auditory–verbal material showed almost
complete overlap. Smith and Jonides [13,28] suggested a
three-component model of working memory, with separate
storage systems for verbal, spatial and object material. It has
been argued that due to the superior storage capacities of the
phonological loop, verbalizable visual material is transformed
into a phonological code for rehearsal in phonological
working memory [2,19,28].
* Corresponding author. Institute of Cognitive Neuroscience, Department of Neuropsychology, Ruhr-University Bochum, Universitätsstraße 150, D-44780 Bochum, Germany. Fax: +49 234 32 14622.
E-mail address: [email protected] (J. Peters).
0926-6410/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.cogbrainres.2005.07.001
J. Peters et al. / Cognitive Brain Research 25 (2005) 406–415
There is, however, some evidence against this recoding
view, as far as visual–verbal (non-pictorial) material is
concerned. Working memory for auditorily and visually
presented verbal material has been previously investigated
using event-related potentials [25]. This study revealed a
fronto-central negativity associated with an auditory–verbal
processing stream and a transient posterior positivity
associated with a visual–verbal processing stream [25].
There is evidence from a recent fMRI study that auditory
verbal working memory recruits larger regions in the dorso-
lateral prefrontal cortex compared to visual verbal working
memory, whereas the latter is associated with increased
activity in left posterior parietal cortex, more specifically in
the intraparietal sulcus [8]. Processing of visual and auditory
verbal material may thus involve distinct cortical circuits.
As retention time increases, these processes are thought to
converge onto a unitary, encoding modality independent
rehearsal process [7,25], indicating that high temporal
resolution is essential for an in-depth investigation of
modality specific coding in working memory.
Additionally, visual and auditory verbal short-term
storage systems may differ with respect to their temporal
characteristics. Behavioral and event-related potential data
[22,25] both point towards a more stable and persistent
auditory store and a more transiently activated visual store
for verbal material.
To investigate interactions between auditory and visual
short-term storage for verbalizable material, a novel cross-
modal 2-back paradigm was developed with both auditory
(spoken words) and visual (verbalizable linedrawings) items
presented at encoding and retrieval. Processes that occurred
when a target item from one modality had to be matched to a
2-back item that had been encoded in the same or a different
modality were monitored using ERPs. Attentional and
working memory demands (i.e. number of items to be
rehearsed) were held constant while encoding and retrieval
modalities were systematically varied [31].
We hypothesized that coding differences for visual and
auditory material should not only manifest themselves in
different patterns of brain activity associated with rehearsal
[25], but also in different brain activity patterns associated
with matching represented in early ERP components. For
auditorily presented targets, we expected visual encoding to
give rise to differences in brain activation during retrieval
only if the phonological recoding of the visual items is
incomplete, that is, if the coding of verbalizable pictures in
working memory is at least partly visuo-spatial. Secondly,
we hypothesized that if activation in the visual store decays
more rapidly, re-activation of this system after auditory
stimulation should give rise to enhanced activity reflected in
the visual ERP. At the same time, a more stable and
persistent auditory store might be able to maintain activation
over a longer period of time and thus not be subject to such
switching effects.
2. Methods
2.1. Participants
Fourteen healthy human subjects participated in the
experiment (mean age 24.6, 6 participants were male).
One participant was left-handed. Participants were univer-
sity students who received course credit for their
participation.
2.2. Stimuli
Fifteen German words spoken by a male speaker served
as auditory stimuli. For visual stimulation, the correspon-
ding pictures, taken from a standardized picture set [29],
were used. Pictures were matched both for name agreement,
thus ensuring verbalizability of the items, and visual
complexity [29] in order to avoid differential visual memory
load effects. Additionally, only pictures corresponding to
disyllabic German words were chosen. It has been shown
that immediate serial recall of disyllabic words is not
influenced by the exact spoken duration of the words [15],
indicating that phonological working memory load for these
words can be considered comparable. The following words/
linedrawings were used: apple, flower, glasses, hammer,
jacket, bottle, saw, fork, screw, crown, socks, cat, bicycle,
whistle, scissors.
2.3. Procedure
A modified 2-back paradigm (Fig. 1) was used. Stimuli
were continuously presented for 700 ms each, followed by
the presentation of a fixation cross for 1700 ms. For each
stimulus, participants were instructed to make a same/
different judgement with respect to the 2-back item, that is,
the stimulus presented two trials earlier. Participants were
instructed that content, not modality, was relevant. Eight
blocks of 60 trials each were administered. In each block,
the experimental conditions were presented in a pseudo-
randomized fashion. After each block, participants started
the subsequent block by pressing a button.
The task was continuous in the sense that the 1-back item
of any given trial was the 2-back item in the subsequent trial
and so on. The 1-back item thus served as an interfering
item which was nonetheless task relevant in the subsequent
trial.
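The continuous 2-back rule described in Section 2.3 can be illustrated with a minimal Python sketch. The function name and the example stream are hypothetical and chosen for illustration only; the timing constants are the values reported above. Only content is compared, since modality was not task relevant.

```python
from typing import List, Optional

# Timing values reported in Section 2.3; all names below are illustrative.
STIMULUS_MS = 700
FIXATION_MS = 1700
TRIALS_PER_BLOCK = 60


def two_back_answers(items: List[str]) -> List[Optional[bool]]:
    """Return the correct same/different answer for every trial.

    True means "same as the item two trials earlier". The first two
    trials have no 2-back item, so their answer is None.
    """
    answers: List[Optional[bool]] = []
    for i, item in enumerate(items):
        if i < 2:
            answers.append(None)
        else:
            answers.append(item == items[i - 2])
    return answers


# Hypothetical stream built from words in the stimulus list (Section 2.2).
stream = ["apple", "saw", "apple", "cat", "apple", "cat"]
answers = two_back_answers(stream)

# Duration of one trial and of one block, derived from the reported timing.
trial_ms = STIMULUS_MS + FIXATION_MS
block_s = TRIALS_PER_BLOCK * trial_ms / 1000
```

With the reported timing, one trial lasts 2400 ms and a block of 60 trials lasts 144 s. Each item in the stream serves first as target, then as 1-back item, then as 2-back item, which is what makes the task continuous.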
2.4. Conditions
The experimental setup included 8 experimental con-
ditions, depending on the presentation modalities of 2-back
item, 1-back item and target item (see Fig. 1). Accordingly,
throughout this report, conditions will be labelled
with letter triplets where an “A” represents auditory
presentation and a “V” visual presentation. VAA thus
labels the condition in which the 2-back item was
presented visually whereas 1-back and target item were
both presented auditorily. Trials with a modality change
from 2-back item to target item will be termed “Transformation”
trials. Trials with a modality change from 1-back
item to target item will be termed “Switch” trials
(see Table 1).
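The labelling scheme can be made explicit with a short Python sketch that derives the Transformation and Switch status of a condition from its letter triplet. The function and variable names are hypothetical; the rule itself follows the definitions given in Section 2.4.

```python
from itertools import product


def classify(triplet: str) -> dict:
    """Classify a condition label such as "VAA".

    The three letters give the modality (A = auditory, V = visual) of the
    2-back item, the 1-back item and the target item, in that order.
    """
    if len(triplet) != 3 or any(ch not in "AV" for ch in triplet):
        raise ValueError(f"not a valid condition label: {triplet!r}")
    two_back, one_back, target = triplet
    return {
        "target": target,
        # Transformation: modality change from 2-back item to target item.
        "transformation": two_back != target,
        # Switch: modality change from 1-back item to target item.
        "switch": one_back != target,
    }


# All eight conditions of the design, in the form used in the text.
all_conditions = {"".join(t): classify("".join(t)) for t in product("AV", repeat=3)}
```

For example, `classify("VAA")` marks a Transformation trial without a Switch, which matches the trial shown in Fig. 1.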
Fig. 1. Schematic outline of the cross-modal 2-back paradigm. Stimuli were presented continuously. Here, a trial from condition VAA is shown. The 2-back
item was presented visually whereas 1-back item and target item were presented auditorily. A transformation occurs because 2-back and target item are not
presented in the same modality. Switching is not required because 1-back item and target item were both presented auditorily.
2.5. Electrophysiological recording
EEG was recorded with tin electrodes mounted in an
elastic cap from 30 scalp positions (F7, F3, FZ, F4, F8, FT7,
FC3, FCZ, FC4, FT8, T7, C3, CZ, C4, T8, TP7, CP3, CPz,
CP4, TP8, P7, P3, PZ, P4, P8, PO7, PO3, POz, PO4, PO8)
according to the extended 10–20 system. All scalp electro-
des were referenced to linked mastoids and impedance was
kept below 5 kOhm. To control for ocular artefacts, eye
movements were also recorded. EEG was sampled at a rate
of 500 Hz.
Subjects were seated comfortably in an electrically and
acoustically shielded dimmed room in front of a 17 in.
computer monitor at a distance of about 70 cm. Two 90-arranged response keys were used.
Table 1
Experimental conditions
              Transformation [+]    Transformation [−]
Switch [+]    VVA, AAV              AVA, VAV
Switch [−]    VAA, AVV              AAA, VVV
Note: Transformation represents a modality difference between target item
and 2-back item. Switching represents a modality difference between target
item and 1-back item (see text). Letter triplets refer to the modalities
(A = auditory, V = visual) of 2-back item, 1-back item and target item,
respectively.
The EEG data were analyzed off-line using the Brain
Vision Analyzer software package. After filtering the raw
data with a low pass filter of 40 Hz with 12 dB, data were
segmented to the presentation of the target stimulus, from
200 ms pre-presentation to 1000 ms after presentation. After
removing segments containing artefacts, ocular correction
[12] was carried out. A 200-ms pre-stimulus interval was
used for baseline correction. Thereafter, grand averaged
ERPs were obtained and pooled across all subjects (Figs. 2
and 3).
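The segmentation and baseline-correction steps can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the pipeline of the Brain Vision Analyzer package used in the study: the function names and the synthetic signal are hypothetical, and filtering, artefact rejection and ocular correction are omitted. The sampling rate and the epoch limits are the values reported in Section 2.5.

```python
import numpy as np

FS = 500        # sampling rate in Hz (Section 2.5)
PRE_MS = 200    # pre-stimulus interval, also used as baseline
POST_MS = 1000  # post-stimulus interval


def ms_to_samples(ms: float) -> int:
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(ms * FS / 1000))


def epoch_and_baseline(signal: np.ndarray, onset_sample: int) -> np.ndarray:
    """Cut one epoch around a target onset and subtract the baseline mean."""
    pre = ms_to_samples(PRE_MS)
    post = ms_to_samples(POST_MS)
    if onset_sample - pre < 0 or onset_sample + post > signal.size:
        raise ValueError("epoch extends beyond the recording")
    segment = signal[onset_sample - pre:onset_sample + post].astype(float)
    # Baseline correction: remove the mean of the 200-ms pre-stimulus interval.
    return segment - segment[:pre].mean()


# Synthetic single-channel recording: constant offset plus a step at onset.
recording = np.full(2000, 5.0)
recording[1000:] += 2.0
epoch = epoch_and_baseline(recording, onset_sample=1000)
```

At 500 Hz, each epoch holds 600 samples: 100 before and 500 after target onset. Averaging such epochs per condition and then across subjects would give grand averages of the kind shown in Figs. 2 and 3.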
2.6. Data analysis
Response accuracy and reaction time were recorded.
Only ERPs from correct 2-back trials are included in this
report. N100 and P200 peak components were determined
(see Figs. 2 and 3). N100 was defined as the maximum
negative peak in the time window 80–180 ms, whereas
P200 was defined as the maximum positive peak in the time
window 150–250 ms.
Slow wave amplitude (see Figs. 2 and 3) was analyzed
by computing the mean amplitude in a 200-ms time
window. Window measures ranged from 350 to 550 ms
(auditory slow wave) and from 200 to 400 ms (visual slow
wave). The different window measures for the auditory and
the visual modality were chosen to account for the differences
in reaction time as a function of target modality. Mean
reaction time was ≈150 ms slower for auditory trials (see
Fig. 4a). Auditory and visual slow wave measures therefore
each cover a 200-ms time segment prior to a point
approximately 270 ms before motor response onset. The
subsequent 200 ms of slow wave activity were also analyzed
in both modalities. Since the effects largely resemble those
found for the earlier portion of the slow wave, the data are
not reported here.
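The peak and mean-amplitude measures defined in Section 2.6 can be sketched as follows. The helper names and the synthetic waveform are hypothetical and serve only to illustrate the window definitions; the sketch assumes an epoch that starts 200 ms before target onset and is sampled at 500 Hz, as described in Section 2.5.

```python
import numpy as np

FS = 500      # sampling rate in Hz
PRE_MS = 200  # the epoch starts 200 ms before target onset


def window_slice(start_ms: float, end_ms: float) -> slice:
    """Sample indices covering start_ms..end_ms (inclusive) after onset."""
    def to_index(ms: float) -> int:
        return int(round((ms + PRE_MS) * FS / 1000))
    return slice(to_index(start_ms), to_index(end_ms) + 1)


def peak(erp: np.ndarray, start_ms: float, end_ms: float, polarity: str):
    """Return (amplitude, latency in ms) of the extreme value in a window."""
    win = window_slice(start_ms, end_ms)
    segment = erp[win]
    idx = int(np.argmin(segment)) if polarity == "neg" else int(np.argmax(segment))
    latency_ms = (win.start + idx) * 1000 / FS - PRE_MS
    return float(segment[idx]), latency_ms


def mean_amplitude(erp: np.ndarray, start_ms: float, end_ms: float) -> float:
    """Mean amplitude in a time window, as used for the slow wave."""
    return float(erp[window_slice(start_ms, end_ms)].mean())


# Synthetic waveform: negative deflection at 110 ms, positive at 200 ms.
times = np.arange(-PRE_MS, 1000, 1000 // FS)
erp = (-4.0 * np.exp(-((times - 110) / 20.0) ** 2)
       + 3.0 * np.exp(-((times - 200) / 25.0) ** 2))

n100_amp, n100_lat = peak(erp, 80, 180, "neg")   # N100 window
p200_amp, p200_lat = peak(erp, 150, 250, "pos")  # P200 window
auditory_slow_wave = mean_amplitude(erp, 350, 550)
visual_slow_wave = mean_amplitude(erp, 200, 400)
```

The two slow wave windows differ only in their position: 350 to 550 ms for auditory targets and 200 to 400 ms for visual targets, shifted by the approximately 150 ms difference in reaction time.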
2.7. Statistics
Repeated measures ANOVAs with Greenhouse–Geisser
correction were performed for analysis of the behavioral and
electrophysiological data. Except for response accuracy, all
measures were analyzed separately for auditory and visual
targets. Behavioral data analysis included the factors cross-modal
Transformation and Switching (Table 1). For
technical reasons (hardware problem), the response accuracy
(RA) and reaction time (RT) data of one subject had to
be excluded from analysis. The subject’s data were used in
the electrophysiological analysis.
Electrophysiological data analysis included the factors
Transformation, Switch, Electrode position (AP, anterior:
F7, Fz, F8, central: T7, Cz, T8, posterior: P7, Pz, P8) and
Hemisphere (HM, left: F7, T7, P7, central: Fz, Cz, Pz, right:
F8, T8, P8). Due to large differences in the ERPs as a
function of target modality, electrophysiological data were
analyzed separately for each modality.
Fig. 2. Grand average event-related potentials from all subjects (n = 14) to auditory targets in all conditions. N100, P200 and slow wave amplitudes were
analyzed (see text).
Fig. 3. Grand average event-related potentials from all subjects (n = 14) to visual targets from all conditions. N100, P200 and slow wave amplitudes were
analyzed (see text).
Fig. 4. (a) Reaction times (RT) as a function of the experimental conditions.
Visual RTs were significantly prolonged for AAV and VAV as compared to
VVV and AVV (P < 0.005). (b) Response accuracy (RA) as a function of the
experimental conditions. Auditory RA was significantly higher for AAA and
AVA as compared to VAA and VVA (P < 0.05). Error bars denote SEM.
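For the two-level factors Transformation and Switch described in Section 2.7, a repeated measures F test with one numerator degree of freedom is equivalent to the square of a paired t statistic computed on per-subject contrast scores. The Python sketch below illustrates this equivalence with NumPy. It is not the authors' analysis code: the function names are hypothetical and the data are randomly generated placeholders, not values from the study.

```python
import numpy as np


def contrasts_2x2(data) -> dict:
    """Per-subject contrast scores for a 2 x 2 within-subject design.

    data has shape (subjects, 2, 2): axis 1 is Transformation ([+], [-]),
    axis 2 is Switch ([+], [-]).
    """
    data = np.asarray(data, dtype=float)
    return {
        "Transformation": data[:, 0, :].mean(axis=1) - data[:, 1, :].mean(axis=1),
        "Switch": data[:, :, 0].mean(axis=1) - data[:, :, 1].mean(axis=1),
        "Transformation x Switch": (data[:, 0, 0] - data[:, 0, 1])
        - (data[:, 1, 0] - data[:, 1, 1]),
    }


def f_from_contrast(diff):
    """F value and degrees of freedom for a one-df within-subject effect."""
    diff = np.asarray(diff, dtype=float)
    n = diff.size
    t = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
    return float(t ** 2), (1, n - 1)


# Placeholder reaction times for 13 hypothetical subjects (not study data).
rng = np.random.default_rng(0)
fake_rt = 700 + 50 * rng.standard_normal((13, 2, 2))
results = {name: f_from_contrast(c) for name, c in contrasts_2x2(fake_rt).items()}
```

The degrees of freedom follow directly from the number of subjects: 13 subjects give F(1,12), as reported for the behavioral data, and 14 subjects give F(1,13), as reported for the ERP data. The Greenhouse–Geisser correction only affects factors with more than two levels, such as Electrode position and Hemisphere, which this sketch does not cover.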
3. Results
3.1. Behavioral data
3.1.1. Auditory reaction times
Auditory RTs were generally about 150 ms slower than
visual RTs (mean ≈ 825 ms, see Fig. 4a) due to the
extended presentation time for spoken words compared to
pictures. Statistical analysis of auditory RTs did not reveal
significant effects for Transformation (F(1,12) = 1.228, P =
0.290), Switch (F(1,12) = 4.008, P = 0.068) or
Transformation × Switch (F(1,12) = 0.433, P = 0.523).
3.1.2. Visual reaction times
Analysis of visual RTs revealed a significant main effect
of Switch (F(1,12) = 13.520, P < 0.005). RTs for visual
targets were significantly increased when the 1-back item
was presented auditorily as opposed to when it was presented
visually (see Fig. 4a). No significant effects were found for
the factors Transformation (F(1,12) = 1.790, P = 0.206) and
Transformation × Switch (F(1,12) = 1.306, P = 0.275).
3.1.3. Response accuracy
Response accuracy was generally above or around 80% in
all conditions (Fig. 4b). Statistical analysis of response
accuracy revealed a significant Transformation × Target
Modality interaction (F(1,12) = 7.129, P < 0.05). To further
explore this interaction, Transformation effects were ana-
lyzed for each modality separately. For auditory targets, a
main effect for Transformation reached significance (F(1,12) =
8.662, P < 0.05), indicating that participants made signifi-
cantly more errors in auditory trials when the 2-back stimulus
was visual compared to when it was auditory. For visual
trials, neither Transformation (F(1,12) = 3.933, P = 0.071) nor
Transformation × Switch (F(1,12) = 3.792, P = 0.075)
attained significance.
3.2. Electrophysiological data
3.2.1. Auditory targets
Grand average potentials for auditory targets with data
pooled across all participants (n = 14) are depicted in Fig. 2.
Auditory target items gave rise to an N100-P200 complex,
followed by slow wave activity.
3.2.1.1. N100. The interactions Transformation × Switch
(F(1,13) = 12.594, P < 0.005) and Transformation × Switch × Hemisphere
(F(2,26) = 6.748, P < 0.01) were significant.
Further analysis of the three-way interaction revealed
significant Transformation × Switch interactions at left
(F(1,13) = 9.525, P < 0.01), central (F(1,13) = 12.774, P <
0.01) and right hemisphere sites (F(1,13) = 9.708, P < 0.01). As
can be seen from Fig. 2, N100 effects were most pronounced
at site Cz. At Cz, the interaction Transformation × Switch was
significant (F(1,13) = 11.978, P < 0.005, see Fig. 5a). Further
analysis revealed that here, N100 amplitude in condition VVA
was significantly larger than in conditions AVA (P < 0.005),
VAA (P < 0.005) and AAA (P < 0.05).
There were no significant effects of Transformation
(F(1,13) = 0.700, P = 0.418), Switch (F(1,13) = 0.067, P =
0.800) or Transformation × Switch (F(1,13) = 0.730, P =
0.408) on auditory N100 peak latency.
3.2.1.2. P200. For P200 amplitude, a main effect of
Transformation emerged (F(1,13) = 9.369, P < 0.01).
Auditory P200 amplitude was thus significantly reduced
for trials where the 2-back was presented visually compared
to auditory 2-back presentation. P200 amplitude was most
pronounced at site Pz. The main effect of Transformation
was also significant at site Pz (F(1,13) = 9.005, P < 0.05, see
Fig. 5b), with VAA and VVA exhibiting smaller amplitudes
than AAA and AVA.
There were no significant effects of Transformation
(F(1,13) = 0.026, P = 0.873), Switch (F(1,13) = 1.545, P =
0.236) or Transformation × Switch (F(1,13) = 0.031, P =
0.862) on P200 latency.
3.2.1.3. Slow wave. A significant main effect of Transformation
on auditory slow wave amplitude was found
(F(1,13) = 6.671, P < 0.05). Visual inspection indicated that
this effect was most pronounced at the left-frontal electrode
F7. At site F7, the main effect of Transformation was also
significant (F(1,13) = 8.811, P < 0.05, graph not shown). The
interaction Transformation × Switch did not reach significance
(F(1,13) = 0.077, P = 0.786).
Fig. 5. Mean amplitudes of auditory N100 at site Cz (a) and auditory P200 at site Pz (b). The observed effects were usually significant across all electrodes or
a large subset of electrodes submitted to the ANOVAs (see text). They were most pronounced at the depicted electrode positions. Significance levels (*P <
0.05) indicate results of ANOVAs conducted at the selected electrodes. (a) Auditory N100 amplitude was larger in condition VVA as compared to the other
conditions. (b) Auditory P200 was significantly reduced for Transformation conditions VAA and VVA as compared to AAA and AVA. Error bars denote
SEM.
3.2.2. Visual targets
Grand average potentials for visual targets with data
pooled across all participants (n = 14) are depicted in Fig. 3.
Visual targets also gave rise to an initial N100 peak
followed by a positive deflection. A pronounced positive
slow wave followed.
3.2.2.1. N100. For visual targets, a main effect of Switch
(F(1,13) = 12.066, P < 0.005) emerged. The interaction
Transformation × Switch, however, was not significant
(F(1,13) = 4.593, P = 0.052). An examination of the Switch
effect revealed that visual N100 amplitude was generally
larger when the 1-back item had been presented visually as
opposed to auditorily. Furthermore, there was a significant
Switch × Electrode position interaction (F(1,13) = 5.636,
P < 0.05), indicating that the Switch effect was significant
at anterior (F(1,13) = 13.364, P < 0.005) and central
electrodes (F(1,13) = 11.785, P < 0.005), but was not
significant at parietal sites (F(1,13) = 4.464, P = 0.055).
Visual N100 effects were most pronounced at site Cz.
Here, the main effect of Switch was also significant
(F(1,13) = 8.059, P < 0.05, see Fig. 6a).
3.2.2.2. P200. For visual P200, the interaction Switch × Hemisphere
reached significance (F(1,13) = 8.671, P <
0.005). However, a subsequent analysis of this effect for
each hemisphere separately did not yield significant effects
for Switch at left (F(1,13) = 0.708, P = 0.415), central
(F(1,13) = 4.481, P = 0.054) or right electrodes (F(1,13) =
0.281, P = 0.602).
3.2.2.3. Slow wave. A significant main effect of Switch was
found (F(1,13) = 16.234, P < 0.005) with increased amplitudes
for conditions AAV and VAV compared to VVV and AVV.
The factor Transformation was not significant (F(1,13) =
4.396, P = 0.056). The interactions Switch × Hemisphere
(F(2,26) = 6.198, P < 0.05), and Transformation × Switch × Hemisphere
(F(2,26) = 5.137, P < 0.05), also reached
significance. A subsequent analysis of the latter 3-way
interaction for each hemisphere separately revealed that the
interaction Transformation × Switch reached significance at
central electrodes (F(1,13) = 4.708, P < 0.05) but not at left
(F(1,13) = 0.058, P = 0.813) or right hemisphere sites
(F(1,13) = 2.353, P = 0.149). This effect was most
pronounced at site Pz. At this site, the main effect of
Switch was also significant (F(1,13) = 11.978, P < 0.005,
see Fig. 6b).
Fig. 6. Mean amplitudes of visual N100 at site Cz (a) and visual slow wave at site Pz (b). The observed effects were usually significant across all electrodes or a
large subset of electrodes submitted to the ANOVAs (see text). Effects were most pronounced at the depicted electrode positions. Significance levels (*P <
0.05; **P < 0.01) indicate results of ANOVAs conducted at the selected electrodes. (a) Visual N100 was significantly reduced for the Switch conditions VAV
and AAV compared to VVV and AVV. (b) Visual slow wave amplitude in the time window 200–400 ms poststimulus was significantly increased for the Switch
conditions VAV and AAV compared to VVV and AVV. Error bars denote SEM.
4. Discussion
In the present study, visuo-verbal working memory
processes were investigated in a cross-modal 2-back
paradigm. The task involved modality switching between
encoding and retrieval as well as between 1-back item and
target item. Behavioral and electrophysiological responses
to auditory targets were found to be modulated by the
presentation modality of the 2-back item, whereas responses
to visual targets were primarily modulated by the modality
of the 1-back item. Transformation trials were those trials
where target and 2-back item were not in the same modality,
whereas Switch trials were those trials where target and 1-
back item were not in the same modality.
Increased error rates were found for auditory Trans-
formation trials. This effect was not found for visual
Transformation trials. Behavioral data indicate that this
effect only occurs when an auditory target item has to be
matched to a visually encoded 2-back item but not vice
versa. As one cognitive component of the 2-back paradigm
is an embedded matching subtask [31], this effect could be
attributed to an increased difficulty in performing this task
in auditory Transformation trials. The data may thus point
towards an asymmetry with respect to the ease with which
auditorily and visually encoded verbalizable material can be
successfully matched in working memory. It has been
argued that auditorily encoded verbal material is stored in a
phonological code that has more sensory aspects than the
phonological code which is created when verbalizable
visual material is phonologically recoded [22]. In the
context of her “separate-streams model”, Penney [22] has
termed this code the A-code (acoustic code) as opposed to
the P-code, which corresponds to the memory trace created
upon phonological re-coding of visual material. The RA
data of the present study indicate that a representation in P-
code is matched more easily onto a representation in A-code
than the reverse, thus yielding further evidence for the
proposed distinction by Penney [22].
RT analysis revealed prolonged RTs for visual Switch
trials. This effect was not observed for auditory Switch
trials. In the model of separate processing streams for
auditory verbal and visual verbalizable material, Penney has
presented behavioral evidence for different temporal char-
acteristics of these streams [22]. More recently, Ruchkin and
colleagues have reported encoding modality-dependent
patterns of rehearsal-related brain activity supporting
Penney’s proposal [25]. In this context, the observed RT
effects can be interpreted as further evidence for distinct
temporal characteristics of short-term storage of verbal-
izable material as a function of the encoding modality. We
suggest that encoding of a novel verbalizable picture into
working memory requires re-activation of a visual or visuo-
spatial store that may correspond to the visual–verbal store
described by Penney as well as to Baddeley’s visuo-spatial
sketchpad. In accordance with the view that activity in this
store decays more rapidly than in the auditory store, longer
RTs for visual Switch trials possibly reflect increased
processing resources allocated to the re-activation of the
visual store.
Generally, the suggested interpretation of the behavioral
data in terms of Penney’s separate-streams model is also
supported by the electrophysiological data. The distinction
between a representation of auditory items in A-code and of
phonologically recoded visual items in P-code, which has
fewer auditory sensory properties and faster decay, is reflected
by the findings for auditory N100. N100 amplitude was
largest for condition VVA as opposed to the other
conditions. Firstly, the possibility that this finding is merely
a result of attenuation or of refraction, processes known to
modulate N100 amplitude [20], needs to be considered.
Attenuation can be ruled out because, in this case, a similar
(though less pronounced) effect would be expected for
condition AVA. However, a tendency for the reverse pattern
was found. An explanation of the observed N100 effects in
terms of refractory processes can be excluded. Refractory
effects would imply reduced N100 amplitude after repetitive
auditory stimulation due to saturation. As there were no
significant differences between AVA on the one hand and
AAA and VAA on the other hand, the data do not suggest
that refractory effects played a role. Since the N100 is
generated partly in primary and secondary auditory cortices
[20], which are known to be involved in auditory working
memory [5], it is likely that the N100 findings can be
interpreted in the context of working memory processes.
Auditory N100 has been previously shown to index working
memory operations [6,10,11].
N100 amplitude for auditory probe items was previously
found to decrease linearly as memory load of auditorily
presented digits increased [6]. In response to auditory
probes, this effect was absent for visually encoded memory
set items in a Sternberg task [11]. Thus, N100 reduction
appears to index memory load for auditorily encoded items
only. The present finding of largest N100 for VVA, the
condition with no auditorily encoded task relevant items in
working memory (2-back and 1-back item have been
encoded visually), is therefore in accordance with the
literature. Additionally, these findings support the notion
that auditorily and visually encoded material is represented
in qualitatively different codes in working memory. Given
that the effect is found in the N100 component, a component
that has been argued to partly reflect activity in auditory
cortices [20], this finding is supportive of Penney’s proposal
that material needs to be encoded auditorily in order to be
represented in the more sensory-based A-code.
Transformation gave rise to a reduced auditory P200
amplitude. In functional terms, modulation of the auditory
P200 has been implicated in attentional processing [16,32].
Attenuation of auditory P200 amplitude has been suggested
to reflect reduced allocation of attentional resources towards
the eliciting stimulus [21]. The modulation of the P200 can
therefore be interpreted in terms of a modulation of the
matching subtask of the 2-back [31]. Unlike in the
commonly used Sternberg paradigm [30], in the 2-back
task, participants know that they have to match the target
stimulus to the 2-back item. Thus, an expectancy for the
target item or for features of the target item can be
generated. In the present study, the expectation for a visual
target when the 2-back had been encoded visually was
reflected in reduced P200 amplitude for the auditory target.
Had the visual items been converted into a representational
code exactly matching the representation after auditory
encoding, no such effect would be expected. Additionally,
the P200 findings parallel the behavioral findings for RA.
Reduced attentional resource allocation to the target item
may be one potential source for the increased error rates in
auditory Transformation trials.
Along similar lines, the findings of prolonged RTs for
visual Switch trials are paralleled by increased slow wave
activity in these trials. Furthermore, numerically, slow wave
amplitude increased the longer ago the last visual stim-
ulation occurred. At least two alternative interpretations of
this effect appear plausible. On the one hand, positive
parietal slow waves have been shown to index the retention
of visual object information [4]. Also, slow wave ampli-
tudes were found to be more negative in conditions of
higher visuo-spatial memory load [24]. Consistent with
these results, we observed the most pronounced positive
slow wave amplitude in AAV. If we assume a partially
visuo-spatial representation of the pictures in working
memory, AAV clearly has the lowest visuo-spatial memory
load of all visual conditions. The present slow wave
findings may thus reflect differences in the amount of
visuo-spatial information that is held in working memory.
On the other hand, in the context of Penney’s “separate-streams
model”, it could be argued that the slow wave data
reflect a switching process that occurs when a more
transient store for visual material is reactivated after
encoding of an auditory item. The latter interpretation
would be more consistent with the findings for visual RTs.
Most likely, however, the present findings reflect a
combination of both factors.
As far as the representation of verbalizable material in
working memory is concerned, the present results indicate
that, depending on the encoding modality, differences can
be observed, both on the behavioral and the electro-
physiological level. Furthermore, the present findings are
consistent with the hypothesis that the phonological
representation of auditorily encoded verbal material only
partially overlaps with the representation of phonologically
recoded verbalizable linedrawings. What do the present
results contribute to the question of the underlying neural
mechanisms? Patients with left-hemisphere lesions are
consistently reported to show impaired verbal memory,
whereas a greater impairment of nonverbal memory is
found in right hemisphere lesion patients [17,18,23].
Similarly, neuroimaging studies have reported a functional
specialization of the left hemisphere for verbal encoding
and the right hemisphere for nonverbal encoding, in regions
of the medial temporal lobes and the prefrontal cortex
[9,14]. More specifically, the involvement of left and right
hemisphere in memory encoding has been found to be
modulated by the degree to which the encoded material
could be verbalized. In one study, left lateralized encoding
activity was observed for words, right lateralized activity
for unfamiliar faces and bilateral frontal activation for
namable objects [14]. This may be the reason why no
lateralized effects have been found in the present study. As
a verbalization strategy was clearly the most efficient
way to perform the cross-modal 2-back task, it is likely that
no subject adopted a purely visuo-spatial rehearsal strategy.
Generally, our results lend some support to the notion that
auditorily encoded words are represented in a more sensory-
based code than are phonologically recoded pictures. It has
recently been proposed that working memory rehearsal
involves the activation of posterior sensory cortices by
regions in the prefrontal cortex [26]. Thus, by means of
‘‘attentional pointers’’ [26], prefrontal cortex is thought to
maintain activity in those brain regions that have initially
been involved in the perception of the particular stimulus.
The present auditory N100 effects lend some support to this
idea, as this ERP component is thought to reflect to a
considerable extent the activity of primary and secondary
auditory association cortices [20]. Our results suggest that
these regions may be more involved in the rehearsal of
auditorily as compared to visually encoded verbalizable
material.
While the present results support previous findings
concerning more persistent storage of auditorily encoded
verbal material, the question arises whether this is a unique feature
of the auditory modality in general, or whether it pertains only to
phonological material. Addressing this question would
necessitate the use of non-phonological auditory material
such as non-verbalizable sounds. It has been suggested that
rehearsal of auditory material may be facilitated by its
intrinsic temporal characteristics [22]. As this applies not only
to verbal but also, to a certain extent, to non-verbal
auditory material, temporal persistence may be hypothesized
to be a more general feature of the auditory modality,
regardless of whether the material is verbal or not. Further
research addressing this issue is clearly needed.
Acknowledgments
This research was funded by the International Graduate
School of Neuroscience, Ruhr-University Bochum, and the
German Research Society (DA 259/9-1).
References
[1] A. Baddeley, Working Memory, Oxford Univ. Press, New York,
1986.
[2] A. Baddeley, The episodic buffer: a new component of working
memory? Trends Cogn. Sci. 4 (2000) 417–423.
[3] A. Baddeley, G. Hitch, Working memory, in: G.H. Bower (Ed.), The
Psychology of Learning and Motivation, vol. 8, Academic Press, New
York, 1974.
[4] V. Bosch, A. Mecklinger, A.D. Friederici, Slow cortical potentials
during retention of object, spatial, and verbal information, Brain Res.
Cogn. Brain Res. 10 (2001) 219–237.
[5] M. Colombo, M.R. D’Amato, H.R. Rodman, C.G. Gross, Auditory
association cortex lesions impair auditory short-term memory in
monkeys, Science 247 (1990) 336–338.
[6] E.M. Conley, H.J. Michalewski, A. Starr, The N100 auditory cortical
evoked potential indexes scanning of auditory short-term memory,
Clin. Neurophysiol. 110 (1999) 2086–2093.
[7] S.M. Courtney, L. Petit, J.V. Haxby, L.G. Ungerleider, The role of
prefrontal cortex in working memory: examining the contents of
consciousness, Philos. Trans. R. Soc. Lond., B Biol. Sci. 353 (1998)
1819–1828.
[8] S. Crottaz-Herbette, R.T. Anagnoson, V. Menon, Modality effects in
verbal working memory: differential prefrontal and parietal responses
to auditory and visual stimuli, NeuroImage 21 (2004) 340–351.
[9] A.J. Golby, R.A. Poldrack, J.B. Brewer, D. Spencer, J.E. Desmond,
A.P. Aron, J.D. Gabrieli, Material-specific lateralization in the medial
temporal lobe and prefrontal cortex during memory encoding, Brain
124 (2001) 1841–1854.
[10] E.J. Golob, A. Starr, Serial position effects in auditory event-related
potentials during working memory retrieval, J. Cogn. Neurosci. 16
(2004) 40–52.
[11] E.J. Golob, A. Starr, Visual encoding differentially affects auditory
event-related potentials during working memory retrieval, Psycho-
physiology 41 (2004) 186–192.
[12] G. Gratton, M.G. Coles, E. Donchin, A new method for off-line
removal of ocular artifact, Electroencephalogr. Clin. Neurophysiol. 55
(1983) 468–484.
[13] J. Jonides, E.E. Smith, The architecture of working memory, in: M.D.
Rugg (Ed.), Cognitive Neuroscience, Psychology Press, Hove, 1997,
pp. 243–276.
[14] W.M. Kelley, F.M. Miezin, K.B. McDermott, R.L. Buckner, M.E.
Raichle, N.J. Cohen, J.M. Ollinger, E. Akbudak, T.E. Conturo, A.Z.
Snyder, S.E. Petersen, Hemispheric specialization in human dorsal
frontal cortex and medial temporal lobe for verbal and nonverbal
memory encoding, Neuron 20 (1998) 927–936.
[15] P. Lovatt, S.E. Avons, J. Masterson, The word-length effect and
disyllabic words, Q. J. Exp. Psychol., A 53 (2000) 1–22.
[16] G.R. Mangun, S.A. Hillyard, Selective attention: mechanisms and
models, in: M.D. Rugg, M.G. Coles (Eds.), Electrophysiology of
Mind, Oxford University Press, 1995.
[17] B. Milner, Interhemispheric differences in the localization of psycho-
logical processes in man, Br. Med. Bull. 27 (1971) 272–277.
[18] B. Milner, Disorders of learning and memory after temporal lobe
lesions in man, Clin. Neurosurg. 19 (1972) 421–446.
[19] D.J. Murray, Articulation and acoustic confusability in short-term
memory, J. Exp. Psychol. (1968) 679–684.
[20] R. Näätänen, T. Picton, The N1 wave of the human electric and
magnetic response to sound: a review and an analysis of the
component structure, Psychophysiology 24 (1987) 375–425.
[21] S. Oray, Z.L. Lu, M.E. Dawson, Modification of sudden onset
auditory ERP by involuntary attention to visual stimuli, Int. J.
Psychophysiol. 43 (2002) 213–224.
[22] C.G. Penney, Modality effects and the structure of short-term verbal
memory, Mem. Cogn. 17 (1989) 398–422.
[23] M. Petrides, B. Milner, Deficits on subject-ordered tasks after
frontal- and temporal-lobe lesions in man, Neuropsychologia 20
(1982) 249–262.
[24] P. Rämä, K. Kesseli, K. Reinikainen, J. Kekoni, H. Hämäläinen, S.
Carlson, Visuospatial mnemonic load modulates event-related slow
potentials, NeuroReport 8 (1997) 871–876.
[25] D.S. Ruchkin, R.S. Berndt, R. Johnson Jr., W. Ritter, J. Grafman, H.L.
Canoune, Modality-specific processing streams in verbal working
memory: evidence from spatio-temporal patterns of brain activity,
Brain Res. Cogn. Brain Res. 6 (1997) 95–113.
[26] D.S. Ruchkin, J. Grafman, K. Cameron, R.S. Berndt, Working
memory retention systems: a state of activated long-term memory,
Behav. Brain Sci. 26 (2003) 709–728 (discussion 728–77).
[27] E.H. Schumacher, E. Lauber, E. Awh, J. Jonides, E.E. Smith, R.A.
Koeppe, PET evidence for an amodal verbal working memory system,
NeuroImage 3 (1996) 79–88.
[28] E.E. Smith, J. Jonides, Working memory: a view from neuroimaging,
Cogn. Psychol. 33 (1997) 5–42.
[29] J.G. Snodgrass, M. Vanderwart, A standardized set of 260 pictures:
norms for name agreement, image agreement, familiarity, and visual
complexity, J. Exp. Psychol. Hum. Learn. 6 (1980) 174–215.
[30] S. Sternberg, High-speed scanning in human memory, Science 153
(1966) 652–654.
[31] S. Watter, G.M. Geffen, L.B. Geffen, The n-back as a dual-task: P300
morphology under divided attention, Psychophysiology 38 (2001)
998–1003.
[32] M.G. Woldorff, S.A. Hillyard, Modulation of early auditory process-
ing during selective listening to rapidly presented tones, Electro-
encephalogr. Clin. Neurophysiol. 79 (1991) 170–191.