Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and...

download Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain Topography, 13(3), 169-193.

of 26

Transcript of Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and...

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    1/26

    Brain Electrical Activity Evoked by Mental Formation of

    Auditory Expectations and Images

    Petr Janata*

    Summary: Evidencefor the brainsderivationof explicit expectancies in an ongoing sensorycontext hasbeen wellestablished by studies of the P300andprocessingnegativity(PN) componentsof the event-relatedpotential(ERP). "Emitted potentials"generatedin theabsence of sensoryinputbyun-expected stimulus omissions also exhibit a P300 component andprovide anotherperspective on patterns of brain activity related to the processingofexpectancies. The studies described herein extendearlier emittedpotential findings in severalaspects. First,high-density(128-channel) EEG record-ings areused fortopographical mapping of emittedpotentials. Second,the primary focus is on emitted potential componentsprecedingthe P300, i.e.those componentsthat aremore likelyto resemble ERPcomponentsassociated with sensoryprocessing. Third, thedependenceof emittedpotentialson attention is assessed. Fourth,subjects knowledge of the structureof an auditory stimulus sequence is modulatedso that emittedpotentials canbe

    compared between conditions that are identical in physical aspects but differ in terms of subjects expectations regarding the sequence structure.Finally, a novel task is used to elicit emittedpotentials,in which subjects explicitlyimagine thecontinuationsof simple melodies. In this task, subjectsmentally completemelodic fragments in the appropriate tempo, eventhoughthey know withabsolute certainty that no sensory stimulus willoccur.Emittedpotentials were elicitedonly whensubjectsactively formedexpectationsor images. Thetopographiesof theinitial portion of the emittedpo-tentials were significantly correlatedwith theN100 topography elicited by corresponding acoustic stimuli, butuncorrelatedwith the topographies ofcorresponding silence control periods.

    Key words: Emitted potential; Imagery; ERP; N100; High-density EEG; Superior temporal gyrus.

    Introduction

    Mental images and expectancies shape our percep-tion of the world. Auditory expectations play an impor-tant role in the way we perceive incoming auditoryinformation - in other words, what we expect to hear af-fects what we actually hear. For instance, auditory expec-tations allow us to restore the percept of a portion of a

    speech sound that has been masked partially by noise(Samuel 1996; Samuel 1981a; Samuel 1981b; Warren 1970;Warren 1984), and they assist us in tracking a singlespeakers voice at a cocktail party (Cherry 1953). From a

    theoretical perspective, "adaptive resonance" (Grossberg1980) and"reentrant"(Edelman 1989)modelsproposethatinteractions of incoming sensory information andexperientially derived expectations generate dynamic ac-tivity patterns within networks of reciprocal cortico-cortical, cortico-thalamic, and cortico-limbic connections.Unfortunately, very little is known about the actualneurophysiological mechanisms underlying the interac-tions of internally generated images and expectationswith representations of sensory input.

    Event-related brain potentials (ERPs) have been usedextensively tostudy theperceptual andcognitiveprocessesassociated with the processing of discrete external (exoge-

    nous) stimulus events and the modulation of this process-ing by endogenous factors such as attention and variousformsofmemory(Ntnen 1992).One hallmarkof theau-ditoryevokedpotential is theN100, a vertex-negative peakin the ERP waveform occurring approximately 80-120 msfollowingstimulus onset.TheN100 is of particular interestbecause it is influenced by both exogenous and endoge-nous factors (Ntnen and Picton 1987) and is believed toarise in the auditory cortex (Liegeois-Chauvel et al. 1994;

    * Institute of Neuroscience, University of Oregon, Eugene, OR,

    USA.

    Accepted for publication: December 20, 2000.

    Dr. Ramesh Srinivasan kindly provided programs for interpolat-

    ing topographical images. I thank Dr. Michael Posner and Tom Winn

    for helpful discussions, and Drs. Michael Posner, Ramesh Srinivasan,

    Mireille Besson, and several anonymous reviewers for comments on

    earlierversions of thismanuscript. The research wassupported by the

    following NIH grants: Systems Neuroscience Training Grant,

    GM07257, at the University of Oregon , NRSA (NS10395) to P.J.,

    MH42669 to Dr. Don Tucker,and Program Project Grant (NS17778-18)

    at Dartmouth College. Additional support was provided by the

    McDonnell/PewCenterfor theCognitiveNeuroscience of Attention at

    the University of Oregon.

    Correspondence and reprint requests should be addressed to Petr

    Janata, Dept. of Psychological and Brain Sciences, Dartmouth College,

    6207 Moore Hall, Hanover, NH 03755, USA.

    E-mail: [email protected]

    Copyright 2001 Human Sciences Press, Inc.

    Brain Topography, Volume 13, Number 3, 2001 169

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    2/26

    Pantevet al.1995; Verkindt et al.1995). Thus, neuraleventssurrounding theN100 areparticularlyrelevant to studyinginteractions of internally derived expectations or images,and representations of the external environment.

    Attempts to separate endogenous components of theauditory ERP waveform from exogenous components

    usuallyinvolvea comparison of potentials evoked by sen-sorystimulithat havebeen influencedbyendogenousfac-tors to varying degrees. In all such cases, the potentialsmeasuredreflect an interactionof a sensory stimulus withendogenous factors. If theinteractionbegins as early as 50ms (Woldorff and Hillyard 1991), the early interactionspresumablyinfluence later interactions, therebymakingitvery difficult toseparatethoseaspectsof theresponse thataredue purely to top-down influences from those that areduepurely to sensory influences. Veryfewattemptshavebeen made to measure potentialsassociated with endoge-nousprocessesabsent thesensoryinput. Thus, onegoal ofthe experiments described in the present paper is to help

    dissociate the exogenous and endogenous components ofevoked potentials as well as provide another perspectiveonhowtheyinteractastheresponsetoaneventevolves.

    The primary difficulty in measuring responses to in-ternally generated, i.e. imagined or expected, events is dueto the uncertainty in determining the onset times of suchevents and the need to average responses to many suchevents in order to obtain an ERP. Two attempts by Wein-berg and colleagues (1970, 1974) to overcome the problemof estimating when subjects expectations for sensoryevents occurred resulted in the successful recording of"emitted" potentials in the absence of an actual sensory

    event. In one case, potentials were emitted when subjectsdeveloped a temporal expectancy for when one stimuluswould follow another. In theother, subjectsspontaneouslysignaled whether or notthey expected an auditory event tooccur. Ontrials inwhich theyexpectedtoheara sound butnone arrived, an emitted potentialwasgenerated, whereasnone was present in the absence of an expectation.

    Further work on emitted potentials focussed onwhether the late positivity observed in response tounexpected stimulus omissions was the same as the P300observedinresponsetorare ornovelstimuli.Ruchkin etal.(1975) demonstrated a comparable dependence of evokedand emitted P300 amplitude on stimulus probability, and

    subsequently demonstrated that emitted potentials con-tain temporal expectations about the time of stimulus oc-currence (Ruchkin and Sutton 1978). Emitted potentials tounexpected stimulus omissions in musical and linguisticcontexts also include a large P300, consistent with the ideathat expectancies are formed and then violated by theomission (Besson and Fata 1995; Besson et al. 1997).

    In principle, expectations of stimuli could consist ofnot only when the stimulus is expected to occur but alsosome information about its physical properties, e.g. sen-

    sory modality. If there is even a partial dependence ofemitted potentials on the sensory modality of the ex-pected stimulus, the topographical distributions of emit-ted potential features should show a difference acrosssensory modality and may even show similarities to ex-ogenous components of potentials evoked by stimuli

    from different modalities. Simson et al. (1976) reportedthat topographical distributions of early portions ofemitted responses resembled those of their evoked po-tential counterparts in trains of auditory or visual stim-uli. With musical stimuli, the large early components ofemitted potentialsresembletheAEP recorded at midlinesites, though the peaks are somewhat delayed relative topeaks in response to audible stimuli (Besson and Fata1995; Besson et al. 1997). One impetus for the presentstudy were the results of an earlier ERP study of musicalexpectancywhichshowedN1/P2andsustainedpositivewaveforms at central sites during a period of silence inwhich participants were to imagine the bestpossible res-

    olution to a sequence of chords prior to the occurrence ofthe actual resolution (Janata 1995). In this earlier study,however, the possibility of an N1/P2 waveform index-ing auditory imagery was confounded with a possibleoffset response to the chord immediately preceding theimagery period. Thus, the present experiments were de-signed to eliminate contamination by offset responses.

    A prevalent, yet untested, assumption is that emittedpotentials are automatically elicited by stimulus omis-sions, and that they can be generated only by unexpectedomissions. In the auditory ERP literature, a very strongdistinction has been drawn between attentionally main-

    tained expectations, e.g. for targets in a target detectiontask, andautomaticexpectationsformedon thebasis of re-peated sensory input (Ntnen 1990; Ntnen 1992).Thelattertype of expectation is notinfluenced byattentionand is indexed by the mismatch negativity (MMN). TheMMNappears to begeneratedbystimulus omissions onlywhen thestimulusonsetasynchrony (SOA) is lessthan150ms (Tervaniemi et al. 1994). Thus the same automatic pro-cesses that are postulated for the MMN cannot readily ex-plain the bulk of the emitted potential data. Omissions ofregularly spaced auditory stimuli do, however, elicit re-sponses during the non-attentive condition of silentlyreading a book (Joutsiniemi andHari 1989; Raij et al.1997).

    Although the book reading task is used as a condition ofauditory inattention in MMN studies, the authors of thesepapers argue that the emitted responses do not representthesame generatorsas theMMNbecause theamplitude ofthe emitted responses depends on attention.

    Overall, the question of whether emitted potentialsreflect an automatic response remains in need of testing.Thus, one goal of the two experiments described in thispaper was to test whether emitted potentials really repre-sent responses to violations of expectancies generated au-

    170 Janata

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    3/26

    tomatically over the course of a sequence of repeatingsounds, or whether they depend on voluntarily formedmental images. More importantly, the present studiesseekto establish whetheremittedpotentials aregeneratedin a situation that does not involve unexpected stimulusomissions. Because the stimulus omissions in the studies

    described above were unexpected, the emitted potentialsall reflect a violation of an expectancy. Will potentials beemitted following the end of a note sequence even whenthesubjectknows thatno stimulus willoccur? Finally, thesecond experiment compared emitted responses gener-ated under task conditions that resembled the traditionalemitted potential paradigm with potentials emitted assubjects explicitly imagined a continuation of a simplemelodicphrase. As such, theexperimentsrepresent a pre-liminary comparison of imagery and expectancy forma-tion. Inbothstudies, the primaryfocus of theanalysis wason the early stages (0-200 ms) of the emitted potential re-sponse rather than the P300. Specific comparisons were

    made between the topographies associated with N100 re-sponses to heard notes, and the topographies to corre-sponding imagined notes and expected and unexpectedsilences during the same time windows.

    Experiment 1

    Methods

    Participants & Stimuli

    Seven subjects engaged in the experiment, andwere

    compensated $5/hr. The age range of participants was18-42 years (mean = 26.4). The range of experience play-ing a musical instrument was 1037 years (mean = 20.7years). All subjects provided informed consent and pro-cedures were approved by the Institutional ReviewBoard at the University of Oregon.

    Individual notes used forascendinganddescendingmelodic phrases were constructed in Matlab (TheMathworks, Natick, MA) using a sampling rate of 22.255kHz. Fundamental frequencies corresponding to thenotes of a D-major scale starting at 246.94 Hz (D2) wereused. Eachnote was a harmonic sound consisting of har-monics 16 of its fundamental frequency. The relative

    amplitudes of the harmonics were set at 1.0, 0.8, 0.7, 0.6,0.5, and 0.4. Individual notes were 250 ms in durationand ramped on and off with 2.5 ms linear ramps. The re-sultingtimbre resembledsmoothnotes created ona brassinstrument. Individual notes were converted to 16-bitAIFF format using the "sfconvert" utility on an Indigo 2workstation (Silicon Graphics, Mountain View, CA) andarranged into melodies using SoundEdit16 on an AppleMacintosh (Apple, Cupertino, CA). The stimuli werestored as snd resources for use by the stimulus presen-

    tation/data acquisition program (EGIS, Univ. of Ore-gon). Stimuli were presented (~65 dB SPL) from anApple Macintosh Power PC 8100 via a mixer (RadioShack, Ft. Worth, TX), amplifier (Marantz, Roselle, IL)and two speakers (Miller & Kreisel, Culver City, CA) sit-uated 1 m apart on either side of a computer monitor

    about 2 m in front of the subject.

    Procedures

    One ascending and one descending melodic phrase(figure 1A) were used in the experiment. Each melodywas presented an equal number of times (50 trials/mel-ody). Assignment of melodies across the trials was ran-dom. The experiment consisted of four blocks of 50 trialseach. The 1st and 3rd blocks were "active" blocks inwhich subjects were required to attend to the melodies,make key presses, and generate images. The 2nd and 4thblocks did not require that images be generated, and sub-jects were free to ignore the melodies.

    At the start of each trial, a number appeared on thecomputer monitor in front of the subject indicating thenumber of trials remaining in the block. The subject initi-ated the trial at his/her leisure by pressing a lever. Thenumber disappeared and a fixation point appeared at thecenterof the screen. The melodycommenced 500 ms aftertheappearance of thefixationpoint.Each trial in an activeblock consisted of three sub-sections. In the firstsub-section of the trial, subjects heard the entire 8-notemelody and were asked to press the lever synchronouslywiththeonsetofthelastnote,asthoughtheywereplaying

    an instrument and were to join in with the melody at thetime of the last note ["All Heard" (AH) condition, figure1B]. The SOA between notes in the melody was 500 ms.Thefixationpoint disappeared750msfollowing theoffsetof the 8th note in the sequence. A 2 s pause preceded theappearance of the next fixation point which signaled thebeginningof the second sub-section. Five hundredms af-ter the appearance of the fixationpoint, subjects heard theinitial five notes of the same melody and were asked tocontinue imagining the final three notes, pressing the le-ver at the time they thought the onset of the last notewould have occurred ["3-Imagined" (3I) condition, figure1B]. The fixation point disappeared 750 ms after the time

    when the last note would have ceased. After a 2 s pauseand subsequent appearance of the fixation point, subjectsheard the initial three notes of the melody and were tocontinue imagining the remaining five, pressing the leverwhen they thought the last note would have occurred["5-Imagined" (5I) condition, figure 1B].

    In contrast to the "active" blocks in which subjectswere imagining notes and making timing judgements,trials in "passive" blocks consisted of only onesub-section per trial. The fixation point and timing pa-

    Electrophysiology of Mental Image Formation 171

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    4/26

    172 Janata

    Figure1. Melodiesand experimental conditions used in Experiment 1. A) Thetwo simplemelodies in musical notation. B)Schematic diagram of the imagery trials. In the "All Heard" condition subjects heard the entire melodic phrase. In the "3Imagined condition," the initial five notes were heard and the remaining three imagined. In the "5 Imagined" condition,subjectsheard theinitial threenotes andimaginedtheremainingfive. On each trial, these three conditions appearedin

    immediate succession. In separate blocks of "No Imagery" trials, subjects heard the initial five notes but did not imaginethe remaining three. C) Microphone recordings of the stimulus sequences during one trial taken from the position of thesubjects head. Note that the sound from each note decayed fully prior to the onset of the following note.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    5/26

    rameters were thesame as the3I condition in the"active"blocks. However, on any given trial, subjects heard onlythe initial five notes of the melody. There were no re-sponse requirements and subjects were asked to simplyignore the melodies andlet them pass by ["No-Imagery"(NI) condition, figure 1B]. It was suggested to them that

    if they didfindthemselves listeningto themelodies, thatthey should hear them as a complete phrase and not toimagine their continuation. The melodies were com-posed such that the five note fragments could be heardas complete motifs. In general, subjects reported no dif-ficultyin either ignoringthe melodiesaltogether or hear-ing them as complete motifs, although 4/7 of thesubjects reported imagining the continuation or repeat-ing the last note on some of the trials. The NI trials werepresented in immediate succession, rather than inter-leaved with the imagery trials. The purpose of these tri-als was to test whether emitted potentials are elicitedautomatically when sequences of notes are not actively

    attended to. We felt the likelihood they would be ig-nored was greatest if they were presented in blocks.

    Prior to the collection of electrophysiological data,subjects received practice trials until their performanceon the timed judgements was consistent and they feltcomfortable with the task (typically515 trials). Subjectswere encouraged to pause and make whatever move-ments they needed to make between trials in the "active"blocks. It was emphasized to them that their imaging ofthe melodies should be completely internal with no ex-ternal movements (finger or toe-tapping) or vocaliza-tions (includingsub-vocalizations) to aidin keepingtime

    or imagining the pitches.The electrophysiological recording sessiontypicallylasted 90 minutes. Following the experiment, subjectsfilled out a short questionnaire with questions abouttheir musical background and subjective evaluation oftheir performance on the task. Subjects were asked torate on a scale of 1-7 the difficulty of imagining the notes(1=very difficult; 7=very easy), and the ease with whichthey could maintain thetempo/rhythm while imaginingthe notes (1=very difficult; 7=very easy).

    Electrophysiological recording and data analysis

    The EEG, recorded from 128 scalp electrodes refer-enced to an electrode at the vertex (Cz) using a GeodesicElectrode Net (Electrical Geodesics, Inc., Eugene, OR),was low-pass filtered (50 Hz) and digitized (12-bit preci-sion)at 125samples/second. Electrodeimpedanceswerein the range of 50 kOhms. All voltage values are ex-pressed relative to theaverage voltage acrossall sites (av-erage-reference)(Tucker 1993). In each sub-trial, EEGrecording began 500 ms prior to the onset of the first notein the melody and continued for 5 s.

    All analyses of the data were performed usingMatlab (Mathworks, Natick, MA). Data were analyzedoffline for movement artifacts. For any given trial, achannels data were rejected if any voltage value in aspecified time window exceeded 50 V, the standard de-viationofthechannelsvaluesexceeded20Voranysin-

    gle value was 6 times the standard deviation of thevaluesfor thatchannel onthat trial. Ifmorethan12 chan-nels(9.3% of allchannels) were rejected fortheparticulartrial, the entire trial was rejected from further analysis.Thetimewindowselected forartifact rejection bracketedthe events of most interest and began with the onset ofthe 3rd heard note and ended 250 ms after the heard/ex-pected offset of the 8th note. Each channels values werebaseline corrected on each epoch by removing the lineartrend from the entire epoch.

    For each subject, the mean key-press timesand stan-dard deviations relative to the onsetof the last note in themelody or the predicted onset of the last imagined note

    were determined for the AH condition and each of theimagined melody conditions, respectively. A single fac-tor repeated-measures analysis of variance (ANOVA)was performed on these values.

    Time-domain averages were constructed for each ofthe experimentalconditions by averaging all artifact-freetrials for each subject. Grand-average waveforms werecomputed by averaging the individual subjects averagewaveforms. In order to illustrate how the patterns of thepotential distributionsevolved through timewithineachof the tasks, interpolated images were created based onthe data from 111 of the 129 electrodesusing a 3-D spline

    algorithm (Srinivasan et al. 1996). The interpolationswere truncated slightly before the outer ring of elec-trodes because interpolations at the outer edge weremore likely to be inaccurate due to lack of empirical databeyond the outer edge of electrodes. In addition, record-ingsfrom theouter ring of electrodes,particularlyat pos-terior sites, tended to be noisy due to poor electrode netfits. Theouter ringof electrodes on theGeodesicNets ex-tends approximately 3 cm beyond the outer ring of the10-20 electrode placement standard.

    The goal of this experiment was to see if the patternsof brain electrical activity, as manifested in the scalp volt-age distributions, were similar under active perceptive

    (AH) andimaginative(3I, 5I)task conditions anddifferentfrom those obtained in the passive perceptive (NI) condi-tion. Topography similarity was assessed by correlatingthe topographies observed during corresponding epochsin the different task conditions, and by a permutationbootstrapproceduredevelopedfor comparingEEG topo-graphical maps (Galan et al. 1997). Briefly, to comparetwotopographical mapsusing thisprocedure, thevoltagevalues in each map are rank ordered, keeping track of theassociated electrode positions. TheEuclideandistance be-

    Electrophysiology of Mental Image Formation 173

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    6/26

    tween electrode pairs of equivalent rank is computed foreach electrode pair. The sum of distances provides a sta-tistic,D. The likelihood of obtaining theobserved valueofD given the topographical maps is assessed by compari-son with a distribution of the D statistic derived from themeasured data. A reference sample for the distribution iscreated by switching voltage values between maps at arandom selection of 50% of the electrode sites and com-puting D for the mixed maps. We created D distributionsusing 1000 iterations of this procedure.

    Results

    Behavior

    Participants reported having no trouble imaginingthe notes in the melody. The mean rating for the ease ofimagining the notes was 6.3 0.8, and 6 1 for ease ofmaintaining the tempo/rhythm. Four of the subjects re-ported inadvertently continuing to imagine the melodyon some of the trials in the NI condition. Fourof the sub-jects reported a mixed strategy for imagining the notes.On some trials they imagined pure pitches whereas onother trials, they mentallysang a syllable with each pitchin the melody. One subject reported using primarily la-bels and two reported forming mostly pure images.

    In order to estimate subjects mental timing of theirmusical images, they were askedto press a lever synchro-nously with the onset of the last note in the complete mel-ody condition andsynchronously with themoment when

    they expected the onsets of the last notes in the two imag-ined melody conditions. The cumulative response timedistributions across subjects are shown as histograms infigure 2. Times are given relative to the onset of the 8thnote in the melody. An ANOVA was performed to testdifferences in the mean key press times and standard de-viations of these times across the three experimental con-ditions in which subjects were required to make a keypress. A highly significant effect was found for both themean and std. dev. [F(12,18)mean = 25.34, p < 0.0001;F(2,18)std.dev. = 36.27, p < 0.0001]. Post-hoc Scheff com-parisons( = 0.05) revealedthat the meandelay inthe AHconditionof 47mswas significantlyshorterthanthe mean

    delays in the 3I and 5I conditions (248 ms and 277 ms, re-spectively). The mean delays in the 3I and 5I conditionswere not significantly different from each other. Thepost-hoc comparisons indicated that the standard devia-tions in the response distributions differed significantlyacross the experimental conditions. The mean standarddeviationincreased from 70ms intheAH condition to113ms and 173 ms in the 3I and 5I conditions, respectively.

    Not suprisingly, the variability in subjects re-sponses increased as the time between external timingcues provided by the melody and the target responsetimeincreased. The lever-press in the AH condition pro-vided a baseline measure of the variability in the execu-tion of a timedmotor act following relative quiescence ofmotor effector systems but in full presence of externaltiming cues. Subjects were explicitly instructed to makeno movements that would assist them in keeping time.Two subjects reported that making the key press syn-chronously with the final note interfered with their abil-ity to imagine it, and three subjects commented thatmaking the key press was somewhat of a distraction.

    174 Janata

    Figure 2. Cumulative distribution of response times fromthe 7 subjects. The abscissa is expressed in ms relative tothescheduled onset of the8th note. Responses were col-

    lapsedinto 5 msbins. The meansand standard deviationsof the plotted distributions are indicated in the upper

    right-hand corner of each plot. A) AH condition in whichsubjects heard 8 notes and were to press a lever synchro-

    nously with the onset of the 8th note. B) 3I condition inwhich subjects heard 5 notes and proceeded to imagine3, making a lever press at the time they would have ex-pected the 8th note to have its onset. C) Same as B, ex-cept subjects heard 3 notes and imagined 5 (5I

    condition).

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    7/26

    Electrophysiology

    The principal aim of this study was to compare thetopographies of evoked potentials generated by heardmusical events in simple musical phrases with theirimagined counterparts in the same, butshortened, musi-cal phrases which the subjects had to complete mentally.The acoustical conditions in each of the task conditionsare illustrated in figure 1C by microphone recordings ofthe melodies made from the position at which subjectswere seated. Therecordings illustrate that each note wasclearly separated from the previous one and that therewere no sharp offsets or clicks as the note decayed thatmight have resulted in evoked potentials induced byacoustic artifacts during times when subjects were re-quired to imagine notes in the melody.

    While the topography time-series is rich in informa-tion, the display of many voltage maps in many condi-tions becomes rather unwieldy. For this reason, most of

    the topographical maps depicted in the figures are to-pographies of voltages averaged over time windows aslong as 96 ms. Figure 3 illustrates that the averaging pro-cedure effectively reduces the number of maps that needto be displayed without eliminating or blurring promi-nent topographical states. The voltages recorded in re-sponse to the first of five imagined notes are shownaveraged over 16 ms (2 sample), 64 ms (8 sample), and128 ms (16 sample) windows (figure 3A-C, respectively).Each colored circle represents a view down onto the topof the head (the nose would be at the top of the circle).The primary topography during the initial 256 ms ischaracterized by a negative focus centrally and positivering around inferior peripheral sites. Similarly, both thepositive centro-parietal focus between 384 and 576 msand the frontal positive focus from 705 to 832 ms arereadily captured across all scales. What suffer most inthe 128 ms averages are the transitions among topo-graphical states, e.g.512-768 ms in figure3A. As thepari-etal-positive focus decays, the frontal positive focusgains in strength. The dashed lines between figure 3Aand 3B, and 3B and 3C bracket the corresponding seg-ments of maps. In figure 3C, the entire transition is sum-marized in the two maps between 512 and 768 ms.Presented only with these two maps, it is obvious that a

    transition occurs, but it is not possible to discern howsmooth or abrupt it is. Nonetheless, the more sustainedtopographical distributions are wellpreserved. As theseform theprincipal focusof this paper, topographies aver-agedoverdurationsofupto100msareconsideredanap-propriate amount of data compression for reducing thecomplexity of the figures while retaining the prevalenttopographical states and transitions.

    Figure4 illustrates thedynamics in thegrand-averagescalp topographies in each of the task conditions. Because

    the events in each trial spanned five seconds, the series oftopographies is divided into five panels. Within eachpanel (figure 4A-E), the topographical maps from all fourexperimentalconditions arerepresentedforthe purposeofcomparison. Every topographical image in the figure rep-resents the average of 12 consecutive time points (96 ms

    window), with no overlap between successive windows.Each panel is divided by a vertical dashed line into two,500 ms segments, and each segment corresponds to a sin-glestimulusevent. Thestimulus event labelsareto theleftof each row of images.

    In addition to showing the topography differencesamong conditions(rows within each panel) at analogouspositions within the melodies, figure 4 illustrates thescalp topographies associated with different ERP com-ponents,suchastheN100,P200,andP300,andthetransi-tions between them. First, the topographies associatedwith the stimulus events of most interest are describedqualitatively. A statistical comparison of the topogra-phies in the different conditions follows in the section,"Correlations of the potential distributions between taskconditions" and figure 6.

    Responses to heard notes

    The acoustic stimulation was identical across taskconditions in theepochs representedin figure4A and4B.Not surprisingly, during the 500 ms prior to the onset ofthemelody, thetopographies observed in thefour condi-tions were very similar. This similarity persistedthroughout the1stand2nd heard notes.Theresponses to

    the 1st heard note were characterized by a topography inwhich the central sites became more negative and theperimeter became more positive. The responses to the2nd and 3rd heard notes better illustrate the topographi-cal distribution that is characteristic of the N100 compo-nent of the ERP, namely a centro-frontal negative focusand positive voltage distribution at inferior sites. Thepositive peak in response to the 1st heard note had a dis-tinct topographythat wascharacteristic of theP200: posi-tive at central and frontal sites and negative about theperimeter (figure 4A, 2nd segment, window centered on750 ms, 250 ms after the onset of the 1st heard note). To-gether, these windows illustrate the topography of the

    negativepeak (N100)to positive peak(P200) transitionofthe auditory evoked potential (AEP) (see also figure 8 -inset). The topographies shown in response to otherheard notes across the conditionsshow that the N100 ep-och wastypically characterized by a centro-frontal nega-tive distribution and positive perimeter, while the P200tended to be represented by a frontal and/or more dif-fuse positivity. Particularly in the imagery conditions(2nd and 3rd rows within each panel), the topographyfollowing the P200 returned to that of the N100 topogra-

    Electrophysiology of Mental Image Formation 175

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    8/26

    176 Janata

    Figure 3. See caption at end of paper.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    9/26

    Electrophysiology of Mental Image Formation 177

    Figure 4. Time-series of the scalp topographies recorded in the four conditions in Experiment 1. The color scales in eachpanel arescaledto 85%of the largest absolute value occurringduringtheepochrepresented in each panel. Each inter-

    polation is averagedacross12 datasamples (96ms total)and sevensubjects. Thefive panels of interpolations aredividedinto two500ms segments by a dashedline. Each segment corresponds to an event in thetask. Theevents correspondingto each rowand segment arespecified in thepanels to theleft. Withineach panel,from topto bottom, therowsof circlescorresponds to theAH, 3I,5I,andNI conditions, respectively. Stimulus conditions areidentical across thefour conditions inpanels A & B, but they begin to differ in panel C.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    10/26

    phy, especially in response to the last note preceding theimagination of the remainder of the melody (figure 4B,2nd segment, 3rd row from the top; figure 4C, 2nd seg-ment, 2nd row from the top).

    Responses to the 1st and 2nd imagined note

    Ofprimary interest in this studywere theresponsestoimagined notes. The responses to the first two imaginednotes were similar in the two imagery conditions (figure4C, 3rd row from top; 4D, 2nd row from top, also summa-rized in figure 5). The topography established during thelast 200 ms of the heard note epoch preceding the firstimagined note was sustained for 300 ms into the 1st imag-ined note epoch. This topography was similar to the N100distribution forheardevents, butwith anenhanced frontalpositivity (figure 4C & D, window centered on 150 ms).When the task did not involve imagery, the voltage distri-bution generated under identical acoustical conditions

    was markedly different (cf. figure 4D, rows 2 & 4).In the imagery conditions, the potential distribution

    transitioned to a state characterized by a strongcentro-parietal positivity at around 350 ms after the pre-dicted onset of the 1st imagined note (figure 4C, row 3;figure 4D, row 2). Between 150 and 250 ms after the pre-dictedonset of the 2nd imagined note, there was a transi-tion in the topography of the potential distribution fromthe centro-parietal positive distribution to a centro-frontal positive distribution (figure 4C, row 3; figure 4D,row 2 at 650-750 ms). This transition occurred in both ofthe imagery conditions, but was entirely absent from theNI and AH melody conditions.

    The main topographical states assumed in responsetothe 1st imaginednotesaresummarized infigure5. Alsoshown arethetopographiesfrom theequivalent timewin-dows during which a subjectwas hearinga note in the AHcondition and when the subject was not imagining any-thing in the NI condition. The topographical maps in theimagery conditions differ from the maps in the NI condi-tion, both with regard to the central negativity in the96192 ms window (figure5D,5G) andthecentro-parietalpositivity in the 376480 ms window (figure 5E, 5H). Incontrast, the maps in the 96192 ms window in the imag-ery conditions are similar to the map corresponding tosubjects hearing the 6th note (figure 5A). The map in fig-ure5Cshows theresponsewhen subjects were hearingthe7th note in the melody and preparing to make a key presssynchronously with the onset of the 8th note.

    Responses to subsequent imagined notes

    Following the 2nd imagined note, the tasks and con-sequently the responses in each of the two imagery con-ditions differed. In the 3I condition, the potentialdistributions that were observed during the 2nd imag-

    ined note resembled those in the AH condition prior tothe key press (compares rows 1 and 2 in figure 4D seg-ment 2, and figure 4E segment 1). In the 5I condition, thefrontally positive potential distribution that was estab-lished during the 2nd imagined note persisted until thepredicted onset of the 4th imagined note (row 3, 4C seg-

    ment 2, and 4D segment 1), whereupon it changed andultimately assumed the potential distribution that ap-peared to characterize preparation and execution of akey press synchronously with the final imagined note.

    Responses to preparation and execution of a key press

    The timed key presses that were required in the AHandimageryconditionsgave risetosequences of potentialdistributions which were entirely absent in the NI condi-tion whichdid notrequirea keypress(figure 4D, 2nd seg-ment; figure 4E). A detail of the topographies is shownalongside the response time distributions in figure 3DF.

    The sequence is best illustrated in the AH condition inwhich subjects were required to make a key press syn-chronously with the final (8th) heard note. In the 300 mspreceding the onset of the 8th note, a potential distributionwas establishedthat was strongly negativeat central sitesand strongly positive frontally and around the perimeter(figure 3D). Shortly after the mean key-press time therewas an inversion in the polarity of this topography. Thesame general pattern of a central negativity and periph-eral positivity that transitioned to a central positivityshortly after the average key press time was observed inboth imagery conditions (figure 3E,3F). The delay in thetopography transition in the imagery conditions com-

    pared to the complete melody condition followed the dif-ferences seen in the response time distributions.

    Correlations of the potential distributions between taskconditions

    The similarities and differences across potential dis-tributions shown in figure 4 were assessed statistically bycomputing the correlation coefficients between the aver-age topographies in the 3I condition and every other con-dition for each 96 ms window. Correlation coefficientswere computed individually for each subject, and the cor-relation coefficients at each time window were compared

    with a two factor (window X comparison) repeated-measures ANOVA, where the "comparison" refers to thecorrelation between the 3I condition and each of the otherconditions. The ANOVA showed a significant main effectof window (F[19,114]=4.53, p < 0.0001), comparison(F[2,12]=12.41, p < 0.005), and a significant interaction be-tween the two factors (F[38,228]=2.17, p

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    11/26

    Electrophysiology of Mental Image Formation 179

    Figure5. Summary of key topographical states in theimagery task. A) Average topography duringthe period of theN100in response to the 6th note in the melody. B) Topography between the offset of the 6th heard note and onset of the 7th

    heardnote. C) Topography after theonset of 7thheard note andat the beginning of preparations to make a keypress.D, E) Average response to 1st imagined note in the3I condition. F) Topography at thetime the2nd note wasto be imag-ined and preparationforkey press shouldhave commenced. G, H) Average responseto 1stimaginednote in the5I con-dition. I) Topography at the time the 2nd note (of 5) was to be imagined. In this condition, preparation for the key presswasstill1 s away. J,K,L) Responsein theno-imagery condition. Thisconditionwasacoustically identicalto the3I condition

    shown in the second row (D-F). For more detail about the transitions linking the topographies shown here, see figure 4.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    12/26

    tions among conditions are marked with an asterisk. Thecorrelation coefficients among the topographies resultingfrom identical acoustical stimulation during the first andsecond heard notes of the melody are shown in figure 6A.As expected, a repeated-measures ANOVA on only thesetime windows showed that for none of the time windows

    did the correlations differ significantly among compari-sons (F[2,12]=3.31, n.s.), indicating that the auditory re-sponses to the initial two notes of the melody in the 3Iconditionwere equallysimilar totheresponsestothesameacoustic events in each of the other conditions.

    During theimagery portions of the trials (figure 6B),the correlations between the topographies evoked underthe various task conditions differed significantly(F[2,12]=13.79, p < 0.001). The solid bars indicate that thecorrelationbetween topographies elicitedin the imageryconditions during imagery of the 1st and 2nd imaginednotes remained high. In contrast, the topographies fromthe acoustically identical 3I and NI conditions were

    uncorrelated. Correlations with the AH conditiontended to assume intermediate values.

    The similarity of topographies in the N100 time win-dow (96-112 ms) was further assessed across conditionsusing the D statistic and permutation technique describedabove. All pairwise comparisons between the conditionswere computed. Thetopography of theN100 fortheheardcondition did not differ significantly from the topogra-phies during the corresponding time window in either ofthe imagery conditions, but it did differ significantly(p

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    13/26

    Electrophysiology of Mental Image Formation 181

    Figure 6. Time-series of the correlations among the experimental conditions. For each subject, the averagescalp-topographies in successive 96 ms windows (such as those shown in figure 4) from the 3I condition were correlatedwith corresponding topographies in the other task conditions. The correlations between conditions (averaged acrosssubjects) are plotted in A & B. Error bars indicate the standard error of the mean. A) Black bars show the correlation ofevokedresponses duringthe1stand2ndheardnotes in the3I and5I task conditions. The hatched bars show thecorrela-

    tion between the3I andAH conditions duringthe1st and2nd heard notes. The white bars shows thecorrelationbetweenthe3I andNI conditions duringthe1st and 2ndheardnotes. Theacoustic stimulation was identical across task conditionsduring this 1 s epoch. The correlations are comparisons of the topographical images from the 2nd segment in figure 4Aand 1st segment in figure 4B. B) The black bars show the correlation between the 3I and 5I conditions during the 1st and

    2nd imagined notes in each condition. Thus, the correlations are a comparison of the topographical images in the 3rdrow fromthe top in figure 4C and the images in the 2nd row from the top in figure 4D. The hatched bars show the correla-tion betweenthe 1stand 2ndimagined notes in the3I conditionand the6thand7thheardnotes in theAH condition(top2rows in figure 4D). The white bars show the comparison of the 1st and 2nd imagined notes in the 3I condition and corre-sponding silence in theNI condition(2nd and 4throws from thetop in figure4D). Asterisksindicate windows in which there

    were significant differences in the correlation coefficients between conditions.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    14/26

    specific moment in time in the completeabsence of an ex-pectation that an actual external stimulus would occur.Thus, these results suggest that it is possible to use ERPsto isolate the brain activations underlying mental imageformation as an act unto itself, absent from the expecta-tion that another stimulus will immediately follow.

    However, the absence of an emitted potential in theNI condition could, in principle, be attributed to theblock design of the experiment, in which subjects couldcompletely ignore the stimuli if they so desired. In otherwords, attentional load was not well-matched betweenthe imagery and no-imagery conditions. While it wasunlikely that they completely ignored the stimuli, giventheir reports that they would on occasion inadvertentlyimagine a note following the partial sequence, we de-cidedto test additional subjects in an experiment that re-quired attention and expectancy formation in bothimagery and no-imagery conditions.

    Experiment 2

    The purpose of this experiment was to replicate thefindings of Experiment 1 and to more fully explore therange of task conditions under which emitted potentialsmight be generated. In order to better match theattentional demands on the participants across the imag-ery and non-imagery trials, a task was devised in whichsubjects had to make a simple judgment about the samenote sequences thatwereusedinthe imagery trials but thejudgmentdid not rely on maintenance of auditoryimagesin working memory. In the"CueValidity" task(described

    in moredetail below),participantsreceived trials in whichthey 1) expected 3 notes and heard 3 notes, 2) expected 5notesand heard5 notes,3) expected5 notesbut heardonly3, and 4) expected 3 notes but heard 5. Conditions 1 and 2allow one to look for emitted potentials following the ter-mination of a sequence when subjects are attentive butknow thesequence will endat a giventime. Condition3 isanalogous to theemitted potentialstudies abovein whichthe sudden cessation of the sequence constitutes an unex-pected omission. Condition 4 looks at responses to unex-pected continuations. To our knowledge, this type ofresponse has not been investigated previously.

    Methods

    Participants & Stimuli

    Data were collected for three participants in this ex-periment. Participants received $5/hr. Oneof thepartic-ipants had been a participant in Experiment 1 also. Therange of experience playing a musical instrument was2230 years (mean = 25.3years). The age range ofthe par-ticipants was 32-40 years (mean = 34). All subjects pro-

    vided informed consent and procedures were approvedby the Institutional Review Board at the University of Or-egon. The acoustic materials were the same as those inExperiment 1. Stimuli were presented via Sony ear-phones positioned in the subjects outer ears.

    Procedures

    This experiment consisted of two trial types: "5Imagined Notes (5I)" and"Cue Validity" (CV) judgments(figure 7). In contrast to the NI trials of the first experi-ment, the no-imagery (CV) trials in this experiment re-quired the subject to maintain attention and make ajudgment about the number of notes they heard. The 5Iand CV trials were randomly interleaved. The melodieswere the same as those used in Experiment 1. In contrastto Experiment 1, subjects did not hear the complete mel-ody condition (AH) on each trial. The AH condition wasomitted from the present experiment given the relative

    easewithwhichparticipants performedtheimagery taskin the first experiment and the need to keep the overallduration of the experiment within reason. With onlythenecessary conditions described below, the overall dura-tion of the experiment, including electrode applicationand post-experiment proceedings, was 2 hours in dura-tion. Prior to the experiment, we played the two melo-dies to the subjects until they were confident they couldreproduce them in their minds. Overall, the subjects re-ported more difficulty (mean ease of imagery rating of4.5) in imagining the notes than did the subjects in thefirstexperiment (mean ease of imagery rating of 6.3). For

    this reason, two of the three subjects were given the op-portunity to refreshtheirmemoryof themelodies duringbreaks that occurred every 10 trials.

    At the beginningof each trial, thewords "Imagine"or"Do Not Imagine" appeared on the screen to indicate thetrial type. In the Imagine condition (figure 7A), the firstnote was played1250ms followingonsetof the visual cue.As in Experiment 1, the SOA between notes was 500 ms.The subjects heard the initial 3 notes of one of two 8-notemelodies andthey had to continue imaginingtheremain-ing5 notes. In contrast to Experiment 1, the subjects werenot required to make a key-press synchronously with thelast note, thereby allowing us to record brain activity un-

    contaminatedwith thepreparatoryphase of makinga keypress. However, 1 s following the expected onset of thelast imagined note, the subjects heard one of six probenotes. Three of the probe notes were the same as the lastthree notesof the melodythey were supposedto imagine.Theother threeprobe notesfell outside of themelody, butwere constrained to be one semitone distant from thewithin-melody probes. The subjects task was to deter-mine as quickly and accurately as possible whether theprobe note was the same or different than one of the last

    182 Janata

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    15/26

    three notes theyhad imagined. Thenexttrial wasinitiated1500 ms following the subjects response.

    In the CV conditions (figure 7B), a number cue ap-peared on the screen 750 ms after the trial type cue. Thisnumber predicted that they would hear either threenotes or five notes. 1250 ms following the appearance ofthe number, subjects heard either three or five notes of

    the melody (500 ms SOA). On 10% of the trials, the pre-diction was violated. Note that the violations were oftwo types. Sometimes, subjects heard only three noteswhen they were expecting five. This condition is analo-gous to the traditional omitted stimulus paradigm inwhich a repeating stimulus is unexpectedly omittedsome fraction of the time. In other CV trials, subjects

    Electrophysiology of Mental Image Formation 183

    Figure 7. Experiment 2 methods. A) In all imagery trials, subjects heardonly the initial three notes of themelody andimag-ined theremaining five. No key press wasrequired. A probenote waspresented 1000 ms followingthescheduledonset ofthelast imaginednote,andsubjects respondedwhetheror notthisnote wasone of thelast threenotes they had heard. B)

    Cue-validity trials. These trials formed a set of control conditions whose purpose was to establish conditions under whichemitted potentials would or would not be elicited. On each trial, a visual cue appeared on the screen predicting that ei-ther 3 or 5 notes would be heard. Either 3 or 5 notes were then presented. A visual stimulus was presented at fixation 1250ms following the offset of the last note and subjects responded whether the prediction had been accurate. In 90% of the

    trials, thepredictionwas valid. In10%of thetrials thepredictionwas invalid. There were twotypes of invalid trials. Presenta-tionof fewer notes thanexpected resultedin anunexpectedomission. Presentationof morenotes thanexpected resultedin an unexpected continuation.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    16/26

    heard five notes when onlyexpecting three, thereby con-stituting an unexpected continuation. One second afterthe onset of the 5th note epoch, a visual cue appeared onthe screen and subjects had to respond as quickly and ac-curately as they could whether or not the predictionmatched the number of notes they actually heard. The

    CV conditions required that subjects maintain attention,but did not force them to retain or form an image in audi-tory working memory when the predicted number ofnotes had passed. Subjects performed 120 imagerytrialsand 400 no-imagery trials. Thus, data were collected for20 of each of the expectancy violation trials in theno-imagery task.

    Electrophysiological recording and data analysis

    The EEG was recorded from 129 electrodes and ex-amined for artifacts as described above, with the sole dif-ference that data were acquired at 250 samples/s and the

    low-pass filter was set to 100 Hz. Given thesmallnumberof subjects, each subjectsaveragedatawere standardized.Z scores were computedbased on the mean and standarddeviation of average voltage values on all good channelsacross all time points in all task conditions. Thestandard-izedsubjectaverages were thenaveraged togetherto formthe grand-averagewaveforms. Topographical mapswereconstructedusing spline interpolation asdescribed above.

    Asinthe firstexperiment, the pointof the present ex-periment was to compare topographies during acousti-cally equivalentepochs experienced under different taskconditions, and to compare topographies elicited byheard notes with topographies elicited by other task

    events such as imagined notes, unexpected omissions orcontinuations, and other heard notes. Grand average to-pographies in the time window from 92-108 ms after theonset of the 4th event epoch were compared using thepermutation test of the D statistic described above. Thiswindow was specifically chosen so that the N100 topog-raphy for the 4th note in the 5 predicted, 5 heard CV con-dition could be compared with the topographies in theother conditions.

    In additionto the permutation test on the D statistic,the topography (vector of voltage values across elec-trodes) at each time point was correlated with the topog-raphies at all other time points. The correlation matrix Ris given by,

    ( ) ( )R

    X X

    XX XXij

    i, k j, kT

    k

    T

    i, i

    T

    j, j

    =

    ,

    where X is the time (rows) by electrode (columns) matrixof voltage values. By specifying the correlation between

    topographies recorded at differenttimepoints, thecorre-lation matrix R allowed us to identify temporally stabletopographical patterns in the data, including those thatcorrespond to the N100 and P200. Furthermore, the cor-relation matrix facilitated a statistical comparison of re-sponses tothe different heard notes in the melody aswell

    as activity during imagery or silence.The correlationvalues were transformed to t-values

    (Hays 1988, pg. 589). Given the large size of the correla-tion matrix, and thereby the large number of multiplecomparisons, the significance of each t-value was cor-rected by Bonferroni correction for multiple compari-sons. Because the correlation matrix is symmetrical andthe correlation values along the diagonal are necessarily1.0 and can therefore be ignored, the total number ofmultiple comparisons for the 1300 time points and fivetask conditions was ~4.22 x 106. Thus, the correctedp-value correspondingto a family-wise a=0.05was~1.19x 10-8. For df=127, this value corresponds to a correlation

    coefficient of 0.47.

    Results

    The grand-average activation time-course for allchannels across the entire trial in each of the conditions isshown in figure 8. The data are presented as electrode Xtime voltage (ETV) matrices to highlight the patterns thatare associated with the various events within and acrossdifferentconditions. The matrices in figure 8C and8D ap-pear to be less smooth than the matrices in the other pan-els. This is due to a smaller number of trials (max of

    20/subject)in these conditions comparedtotheother con-ditions (>100 trials/subject). The first note in the melody(onsetat 0 ms)generatesthesame clearETVpattern acrossthedifferentconditions. At 100 ms this pattern appears asa beaded column of red (positive) and blue (negative)voltageswhich is distinct from the preceding topographyin which channels 30-100 appear as mostly blue and theremaining channels as mostlyred. When viewedas a top-ographicmap(left inset above figure 8A), the beadedpat-tern is shown to correspond to the N100. The N100pattern largely inverts polarity during the following 100ms, and represents the P200 component of the AEP (mid-dle inset above figure 8A). A distinct N100/P200 pattern

    is also observed in response to the probe tone in the imag-ery condition at 4500 ms (figure 8A).

    The responses to successive heard notes in the mel-odyappear in the ETV matrices as repetitions of the basicETVpattern that is elicited by the firstnote in the melody(most clearly seen in figure 8A,B,E). The ETV responsepattern to the first note is considered to encompass theportion of the matrix from just after 0 ms to shortly after500 ms. Aside from the N100/P200 patterns already de-scribed, another feature of the ETVresponse to the heard

    184 Janata

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    17/26

    Electrophysiology of Mental Image Formation 185

    Figure 8. See caption at end of paper.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    18/26

    note is a re-emergence of the N100 distributionin the last150 ms of the note epoch (right inset above figure 8A).Three repetitions of this basic pattern were observedwhen subjects heard three notes (figure 8A,B), and fiverepetitions are observed when they heard five notes (fig-ure 8E). Note that the P200 pattern is strongest in re-

    sponse to the first note in the melody.The prominent patterns that appear as vertical

    bands at ~3700 ms in figure 8B-E, represent the responseto the visual stimulus prompting subjects to indicatewhether or not the number of notes they heard was thenumber expected.

    The ETV patterns of most interest are those that oc-cur between 1500 and 2000 ms. In this time window, thetasks andsensory input across the conditions diverge. Adetail ofthe ETV matrices during thisepochare shownastopographic maps in figure 8F. The response to thefourth heard note when five notes were expected is seenat 1600 ms in Row E (figure 8F). The top three rows infigure 8F show the responses in the three conditions inwhich onlythreenotes wereheard,but inwhich the tasksdiffered. The toprow (A) shows the response when sub-jects imagined the first of five notes. The second rowfrom the top (B) shows the response when subjects ex-pectancies of hearing only three notes was fulfilled. Thethird row from the top (C) shows the response when thesubjects expectancies of hearing five notes were vio-lated. Each of these topographies should be comparedwith the topographical pattern in the bottom row (E)when a note was actually heard.

    When five noteswere expected but only threeheard

    (figure 8F, Row C), a distinct topographical sequenceemerged which was absent when the sequence termi-nated as expected (figure 8F, Row B). Thus, the topo-graphical pattern in Row C of figure 8F represents anemitted potential to an unexpected auditory stimulusomission. As in Experiment 1, the topographies occur-ring around the peak of the N100 of the heard stimulus(92108 ms from the onset of the epoch, or 15921608 mson the timescale in figure 8) were compared using the Dstatistic. The difference in topographies to the unex-pected omission in the 5-expected, 3-heard conditionwas significantly different from the topography in theexpected termination in the 3-expected, 3-heard condi-

    tion (p < 0.05) but not significantly different from theN100 topography to the fourth heard note in the5-predicted, 5-heardcondition(p = 0.74). In the imagerycondition(figure 8F, RowA), thetopography during theN100window alsodifferedsignificantly fromthe topog-raphy in the 3-expected, 3-heard condition (p0.47. Thus, the correlation between any pair of topogra-phies canbe considered statistically significant if thecor-responding point in the correlation matrix on the righthand side of figure 9 is black. The triangles indicate theonsets of auditory, visualand imagined events. Todeter-mine how strongly the topography at any given refer-ence time point in the trial was correlated with thetopography at any other time point, look at the matriceson the left-hand side of figure 9 and locate the reference

    time point along the diagonal. Points directly above thepoint onthediagonal represent correlations with preced-ing topographies, while points extending to the rightrepresent correlations with future topographies.

    Several salient features emerge from the correlationmaps. First, persistent topographicalstatesappearas tri-angles with their base along the diagonal. For example,following the fifth heard note, there is a large trianglepreceding the visual cue, indicating that the topographi-cal pattern between ~2600 and 3500 ms was stable. Fig-ure 8E shows the stable pattern. The same patternfollowing the expected termination of the three note se-quence (figure 8B between 2100 and 3500 ms) shows up

    as a largedark triangle in the correlation matrix in figure9B. Similarly, the repetitions of the ETV patterns de-scribed above for responses the individual heard notes,are seen as smaller dark triangles. Patches of increasedcorrelations, appearing as squares off the diagonal, dur-ing the heard notes show that evoked responses to thedifferentnotes in thesequence elicitedsignificantlysimi-lar topographies.

    Two important observations can be made about thebrain activity patterns that follow the last expected heardnote in the imagery and no-imagery conditions. First, inthe imagery condition (figure 9A), topographiesthroughout the 2 s period of imagery (1500-3500 ms) are

    strongly correlated not only with themselves, but alsowith topographies elicited during the preceding notes.Thetopographies duringimagery arenot asstrongly cor-related with the period between the end of imagery andonset of the auditory probe, suggesting a cessation in theimagery processes after the requisite number of noteshad been imagined. This shift in topography betweenimagery and rest is also visible in figure 8A. Second, thecorrelation matrices differ between the imagery and the

    186 Janata

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    19/26

    Electrophysiology of Mental Image Formation 187

    Figure 9. See caption at end of paper.

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    20/26

    acoustically identical 3-expected, 3-heard condition. Inparticular, the period between 2200 and 3500 whichshowed a substantial number of correlations with theheard notes in the imagery condition, showed very fewsignificant correlations with the heard notes in theno-imagery condition. However, during the silence im-

    mediately following the last heard note, the topographyin the 3-expected, 3 heard condition was strongly corre-lated with the P200 topography, and therefore inverselycorrelated with the N100 topography. This effect showsup in figure 9B as the light blue squares above the darkred triangle along the diagonal between 1700 and 2200ms. Whilethecorrelationwith theP200 topographydur-ing this timewas significant, the inversecorrelationwiththeN100 topographywas not. Surprisingly, thedurationof this effect was equal to the duration of theinter-event-interval, even though there were no exoge-nous timing cues! Thus, there appears to be an emittedpotential effectin theno-imagerycondition, butit is a dif-ferent process from that observed in the imagery condi-tion, given the different topographies.

    The mental state at the end of the 5-expected,5-heard condition no-imagery condition was conceptu-ally identical to that of the 3-expected, 3-heard conditiondescribed above. Subjects knew that there would be nocontinuation of the sequence beyond the fifth note. Asimilar, albeit shorter-lasting and less significant se-quence termination effect following the last heard notewas observed in the topographical pattern in this condi-tion also. Thus, there appeared to be something particu-larly special about the expected termination of the

    sequence after three notes, which gave rise to a topogra-phy that was significantly correlated with the P200. Onepossibility is that subjects knew that their expectationswould be violated on some number of trials (10% to beexact), either by an overshoot or undershoot of the num-ber of predicted notes. Whether an over- or undershootwould occur was always determined uniquely duringthe 4th note epoch. In other words, the fourth note was apivotal event in the trial, in that it would or would not bepresent on some trials. Support for the interpretationthat the fourth note epoch was particularly salient, evenwhen five notes were expected, comes from the topogra-phy correlations in this epoch (figure 9C). Specifically,

    there was a lengthening of the P200 topography, whichappears as a vertical band of weak or negative correla-tions with the dominant N100-like topography at ~1700ms along the diagonal. The P200 topography during the4th heard noteis significantlycorrelatedwith theP200 to-pography to the first heard note. The effect is seen mostclearly in the significance mask plot on the right of figure9C as a broader column of white separating the topogra-phy correlations among the individual heard notes.

    Discussion

    Experiment 2 replicated the findings from Experi-ment 1 and extended the comparison of potentials emit-ted under conditions of imagery with potentialsfollowing expected andunexpectedterminations of notesequences during attentive listening. As in Experiment1, imagining the continuation of a sequence of notes re-sulted in an emitted potential. The emitted potential to-pographies in the imagery condition, both immediatelyfollowing the last of the heard notes and throughout theimagery portion of the trial were distinct from the topog-raphies in the acoustically identical condition in whichsubjects expected to hear, and only heard, three notes.The imagery condition did not differ from the 5-heardnote condition or the unexpected omission conditionwithregardtothetopographyintheN100windowofthefourth event epoch, whereas all three of these conditionsdiffered significantly from the condition in which the

    sensory input terminated as expected.The similarity of the topographies during the N100time window among conditions in which subjects wereactively forming expectations or images suggests that theprocesses of forming a mental image of an auditoryeventactivates those brain regions activelyinvolvedin process-ingthesensory event. Note, however, that comparing be-tween the N100 topography and the topography of theearly portions of the emitted potential in response to animagined event is not the same as saying that imaginingan auditory stimulus generates an N100. For instance, atopography very similar to that of the N100 is observed torecur following the P200. Given that the duration of the

    sounds was 250 ms, the activity between 300 and 500 msfollowing stimulus onset may reflect an offset response(Hillyard andPicton1978;Pictonet al.1978). Nonetheless,the N100 is considered functionally distinct from the ac-tivity between 300 and 500 ms following stimulus onset.In summary,while the topographical similarity of the ini-tial 200 ms of the emitted potential and the N100 to heardnotes is used to argue that the imagery task involves acti-vation of the auditory cortex, the emitted potential is notconsidered to be equivalent to the "true" N100 which isdriven by exogenous stimuli (Ntnen and Picton1987).However, the emitted potential may be related to thetop-down modulators of the N100, because it appears in

    thetime windowrelative to stimulusonset in whichcom-parisonsof sensory input andexpectanciesarebelievedtotakeplace. Becausethese comparisonsarebelievedto takeplace in the auditory cortex (Ntnen 1992; NtnenandWinkler1999), theemitted potentialmay also dependon generators in the auditory cortex.

    The most direct evidence for activation of the audi-tory cortex in musical imagery tasks comes from func-tional neuroimaging experiments using PET by Zatorre

    188 Janata

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    21/26

    and colleagues (Halpern and Zatorre 1999; Zatorre et al.1996). In their tasks, imagining familiar melodies acti-vates a network of brain areas, largely in the right hemi-sphere, including frontal, infero-frontopolar, as well asthe auditory cortex along the superior temporal gyrus(STG). Patients with right temporal lobe excisions per-

    form significantly worse on perceptual and imagerytasks involving pitch comparisons between lyrics in fa-miliar melodies than do controls or patients withleft-hemisphere lesions (Zatorre and Halpern 1993).Thus, the emitted potential results described here areconsistent with the evidence from the functionalneuroimaging and neuropsychology literature.

    Both the imagery task context as well as thecue-validity judgement context, in which listeners expectto hear a specific note at a specific time but none occurs,generated emitted potentials whose early topographieswere not significantly different from each other or fromthe N100 response to the corresponding heard event.

    Thus, it is possible, but not yet proven, that the neuralmechanisms underlyingtheformation of auditory imagesin an "imagery" task may not be so different from the for-mation of auditory expectancies in other tasks requiringauditory attention. A similar equivalence has been pro-posedbyKosslynfor the case of visual imagery and visualperception (Kosslyn 1994, p. 287): "images are formed bythe same processes that allow one to anticipate what onewouldsee ifa particularobjectorscenewerepresent."Au-ditory images/expectancies appear to also activate sen-sory cortices as has been demonstrated for the visualsystem (Kosslyn et al. 1993; Le Bihan et al. 1993; Mellet etal. 1998). Carroll-Phelan and Hampson (1996) have ar-gued for an approach to studying musical imagery withthe main hypothesis that the cognitive sub-systems un-derlying the perception of melodies are the same as thoseunderlying imagining of melodies.

    General Discussion

    Emitted potentials revisited

    The results of the present pair of experiments indicatethat emitted potentials arenota simplebrain response elic-ited by only a single set of task characteristics. Rather, the

    presence and morphology (sequence of topographicalstates) of an emitted potential depends strongly on specifictaskrequirements. While attentionappears to be necessaryto elicit an emitted potential, it may not be sufficient. Theresults of the second experiment suggest that in order toelicit an emitted response, it is necessary to form a mentalimage ofan auditoryevent evenifit isknown thattheeventwill notbe heard. In other words,theemitted potentialcanbe generated independently of an expectation that theevent will actually occur.

    Although emitted potentials have been identified inthepast almostexclusivelywith expectancy violationsandthe P300 component, the appearance of emitted potentialsin a mental imagery task should not discount the possibil-ity that the emitted potentials actually reflect processes in-volved in mental imagery. Furthermore, the results from

    the present experiments raise the question of whether theexplicit formationof auditory images in an imagerytask issimilar to the formation of auditory images described incountless studies on thePNandNd components which in-dex auditory attention during target detection.

    Expectancy formation unmasked by unexpected omissions

    Given the emphasis of earlier emitted potential stud-ies on establishing the equivalence of the emitted-P300andthe P300 (Besson et al. 1997; Ruchkinand Sutton 1978;Ruchkinetal.1980; Ruchkinetal.1975),the earliercompo-nents of emitted potentials were not examined in suffi-cient detail to rule out involvement of neural generatorsassociated with sensory processing. The data from thepresent experiments suggest thatthe early components ofemitted potentials to unexpected omissions indeed acti-vate neural substrates involved in processing incomingauditory information. Thisinterpretation is supported bytheextensive bodies of literature on the modulationof ac-tivityin the auditorycortexbetween50and 250 msfollow-ing stimulus onset by endogenous, top-down, processes(for a review see Ntnen 1992). Both the N100 and theMMN fall within this time window. The MMN has beeninterpreted recently as the stage at which "representa-

    tions" of auditory objects are extracted from afferent sen-sory activation and are accessible to top-down influences,such as built up templates or expectations (Ntnen andWinkler1999). Similarly, theN100is considered toconsistof both exogenous and endogenous factors (Ntnenand Picton 1987) and is believed to arise in the auditorycortex (Liegeois-Chauvel et al. 1994; Pantev et al. 1995;Verkindtet al.1995). Failingtopresentanexpectedstimu-lusmay therefore unmask theportion of the normal audi-tory evoked response which is due to top-downactivation. If the locus of interaction of bottom-up andtop-down informationiswithin theauditory cortex,ashasbeen proposed for both the N100 and MMN, it is reason-

    able toexpectthattop-down activationsmay alsoshow anactivation pattern that is characteristic of the N100.

    In this view, the emitted potentials described in thisstudy may be related to the "negative difference" (Nd) or"processing negativity" (PN) components of ERP wave-forms. The Nd and PN appear as a negative displace-ment of the ERP when subjects attend to a particularfeature of an auditory stimulus in the context of a targetdetection task, compared to when theypassively listen tothe same stimuli. Short-latency forms of the Nd and PN

    Electrophysiology of Mental Image Formation 189

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    22/26

    are believed to arise in the auditory cortex whereas lon-ger-latency components of the PN may arise in the fron-tal lobes (Giard et al. 1988; for reviews of this extensiveliterature see Ntnen 1982; Ntnen 1990; Ntnen1992). The topography of these components is consistentwith that observed for the emitted potentials in the pres-

    ent study. The PN is believed to result from a compari-son process between an "attentional trace" and anincoming sensory stimulus within the auditory cortex.The "attentional trace" is defined as a voluntary mainte-nance (involving both the frontal lobes and auditory cor-tex) of an expected sensory representation, i.e. mentalimage of the target to be detected, that has been built upover multiple presentationsof thetarget (Ntnen1982;Ntnen 1990). Thus, the cognitive processes underly-ingthe"attentional trace"andvoluntarily formedimagesmay share the same functional neural architecture.

    Expectations for the presence vs the absence of an event

    It should be pointed out that expectations can existboth for the presence or the absence of an event. Thus,when subjects were told toonlyexpect threeor five notes,they presumably actively expected to nothear a fourthorsixth note, respectively. Indeed, an unexpected fourthnote eliciteda strong P300 (figure 8F, Row D). The expec-tation that a sequence would end after three notes re-sulted in a topographical pattern for 500 ms in the silentperiod that followed which was highly correlated withtheP200tothefirstnoteinthesequenceanduncorrelatedor weakly negatively correlated with the N100-like to-pography which dominated most of the heard note ep-

    ochs (figure 9B). Quite surprisingly, the duration of thiseffectmatched theSOA of thepreceding notes! A similareffect, but of shorter duration was observed followingthe last of five expected notes. Given that subjects wereinstructed to not continue imagining when only threenotes were predicted, thiseffect might reflectactive inhi-bition of mental image formation. Further data areneeded to fully address this issue.

    Absence of the N100-like topography tosubsequent imagined notes

    Oneprobleminarguingthat theemittedpotential ob-

    served in the imagery condition following sensory inputreflects auditory image formation, is the absence of dis-tinct modulations around the presumptive onsets of theremaining imagined notes. Before discussing potentialreasons for this observation, it must be emphasized, how-ever, thatthe lackof N100-like emittedpotentials to subse-quent imagined notes does not eliminate the fact that theemitted potential to thefirst note in the imagery conditionis task dependent. Nordoes it diminish theimportance ofthe significantly autocorrelated state during the remain-

    derof theimagery trial whichdisappears as imagery endsand the subject waits for the auditory probe. The topo-graphical state during imagery is also significantly corre-lated with responses to the second and third heard notes(figure9A).Thus, eventhough thefocal amplitude modu-lations of an N100-like topography disappear, other indi-

    cators of sustained image formation remain.One explanation for the absence of emitted poten-

    tials beyond the first one is rooted in the limitations of theERP methodology. In Experiment 1, subjects displayed asignificant degree of variability in the time at which theypressed the key as an indication of their expectancy forwhen the last imagined note should have occurred (112ms s.d. in the 3Icondition; 172 ms s.d. in the 5Icondition).If the timing of subjects mental imagery varied like theirkey presses, the temporal variability in any emitted po-tentials to imagined notes from trial to trial would havethe effect of attenuating, broadening, and/or cancelingthe N100-like peaks in the averaged ERP. Originally itwas thought that the key-press data might serve as ameans of aligning individual trials prior to averaging.Several problems precluded this approach. When thistechniquewas applied to thedata from theAH condition,the variability in response times across trials altered theearlier auditory EPs beyond recognition (data notshown). In the imagery conditions, theproblem of inher-ent variability in response timing was compounded bythe inability to predict the onset of imagining each note.For example, on some trials, subjects might imagine onenote a bit too early and another too late. On another trialthey might imagine every note a bit late. Thus, it would

    be impossible to attribute differences in the latency-adjusted averages at each event position to a true differ-ence between brain activations in listen and imagineconditions, or, instead, to an inability to properly alignsingle-trial evoked responses.

    Alternatively, the difference in emitted responsesacross the sequence of imagined notes may result fromchanges in other factors, such as the state of ongoing ac-tivity in the auditory cortex. Specifically, the emitted re-sponse during the first imagined note may reflect aninteraction with sensory memory which is necessarilyactive following the third note. Auditory sensory mem-ory appears to have a short component which lasts sev-

    eral hundred milliseconds from stimulus onset, and alonger component lasting 10-20 s (Cowan 1995). The in-teractionbetween a mentally formedimage andthe shortform of sensory memory would be unique to the firstimagined note. Interactions between the longer form ofsensory memory and mentally formed images might beexpected throughout the remainder of the imagery pe-riod, though it is uncertain whether the interaction of amental image with sensory memory during the firstimagined note would disrupt the sensory memory in

    190 Janata

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    23/26

    such a way that it would no longer interact in the samewaywith subsequent images. In other words, would thepatterns of neuralactivity maintaining thesensorymem-ory remain the same in light of an interaction with atop-down expectancy/imagery process?

    The mechanisms by which mentally formed images

    interact with sensory memory cannot be answered giventhe present data. Nonetheless, the present data illustratethat a variety of emitted potentials can be elicited in a task-dependent manner, and they form a basis for further stud-ies of the relationship between mentally formed imagesandexpectancies,andtheirinteractionwithsensoryinput.

    Figure Captions

    Figure 3. Temporal compression of scalp topographydata (AC) andERP topographiesandtheirassociated re-sponse time distributions of key-presses made synchro-nously with the onset of the last heard or imagined note(DF). AC) ERPdata from thesame epocharedisplayedon three different time-scales as topographical maps.Each circle represents a view down on top of the head.The nose would be at the top of each circle. The dashedlines between subplots bracket corresponding data seg-ments. A) Each map is the average of topographies re-corded at two adjacent samples. B) Each map is theaverage of 8 samples. C) Each mapis theaverage 16 sam-ples. Blue and red indicate negative and positive voltagevalues, respectively. Notethatthe topographicalpatternsthat persist over extended periods of time in A are pre-served in B and C. DF) In each panel, histograms of all

    responsesfrom allsubjects aredisplayedabove thegrandaverage topographicalmaps. The colorscaleand orienta-tionofthetopographicalmapsisthesameasinA-C. Eachtopographical map is the average of six time points (48ms). D) Subjects heard 8 notes and were required to pressthe key synchronously with the onset of the final note. E)Subjects pressed a key at the time when they expectedthat the third imagined note would have occurred. F)Subjects pressed a key at the time when they expectedthat the fifth imagined note would have occurred.Figure 8. Grand-average Electrode X Time X Voltage(ETV) matrices showing the spatio-temporal patternacross theentire trial in each of thefive experimentalcon-

    ditions (A-E) and spatio-temporal patterns of activityduring the fourth note epoch displayed as conventionaltopographical maps (F). As in earlier figures, voltage isencoded by the color scale. A-E) The pattern of voltagevalues across all electrodes (topography) at any singletime point is represented as a column in the matrix. Thesequence of notes began at 0 ms. In A, B, C, the acousticalconditions are identical. The last heard note lasted from1000 ms until 1250 ms. The "onset" of the first imaginednote (A) was at 1500 ms. B) Three notes were expected,

    three notes were heard. C) Five notes were expected, butonly threewere heard. D) Threenotes were expected,butfive were heard. E) Five notes were expected, five noteswere heard. InA,anacousticprobewas presentedat 4500ms. In B-E, a visual probe was presented at 3500 ms. Thecolor bar in E applies to A and B also. The color bar in D

    applies to C also. Approximately 100 trials/subject con-tributedto theimageinA,180trials/subject to theimagesin B and E, and 20trials/subject tothe imagesin C and D.The average auditory evoked potential topographies re-corded in response to the first note of the melody areshown in the insets above panel A. The N100 shows thetypical negative focus at the vertex and positive ring atmore inferior sites on the scalp. The P200 to the first noteis characterized by a centro-frontal positivity that issomewhat right-lateralized. The average topographyfrom100msfollowingtheoffsetofthenoteuntiltheonsetof the next note (N1R) resembles that of the N100, thoughit is notas pronounced. F) Theletters in parentheses asso-

    ciate each row of topographical maps with panels AEshownabove. The maps show theactivation patterns in atime window encompassing the 4th event epoch (1500 2000 ms). In the case of the conditions shown in the topthree rows, there was no acoustic input during this timewindow; only the task conditions differed. In the bottomtwo rows, a note was presented at 1500 ms.Figure 9. Spatial correlation matrices as a function of taskcondition. Thematrices on the left show howstronglythetopography observed at any instant during the trial wascorrelated withtopographiesrecordedat other times dur-ing the trial. Onsets of notes are marked by the tips of theblack triangles. Predicted onsets of imagined notes aremarked by gray triangles. Locate a time point of interestalong the diagonal. The column of values above the pointonthediagonalshowsthe strengthof correlations with to-pographies preceding the event of interest, while pointsalong the row to the right of the location on the diagonalindicate the strength of correlation with topographies fol-lowingtheevent of interest. Periodsduringwhich there isvery littlechangeinthetopographyshowup astriangularareas emanating from the diagonal. See figure 8 for com-parison. Blackpoints in thematrices ontheright show thecorrelations which were statistically significant followingBonferroni correction for multiple comparisons.

    References

    Besson, M. andFata,F. An event-related potential (ERP) studyof musical expectancy: Comparison of musicians withnonmusicians. JEP: Human Percept. and Perf., 1995, 21:1278-1296.

    Besson, M., Faita, F., Czternasty, C. and Kutas, M. Whats in apause: Event-related potential analysis of temporal dis-ruptions in written and spoken sentences. Biological Psy-chology, 1997, 46: 3-23.

    Electrophysiology of Mental Image Formation 191

  • 7/27/2019 Janata, P. (2001). Brain Electrical Activity Evoked by Mental Formation of Auditory Expectations and Images. Brain

    24/26

    CarrollPhelan, B. and Hampson, P.J. Multiple components ofthe perceptionof musical sequences:A cognitiveneurosci-ence analysis andsome implicationsfor auditory imagery.Music Perception, 1996, 13: 517-561.

    Cowan, N. Attention and Memory. Oxford University Press,New York, 1995.

    Edelman,G.M.Therememberedpresent: a biological theory of

    consciousness. Basic Books, New York, 1989Galan,L., Biscay, R.,Rodriguez,J.L., Perez-Abalo, M.C.and Ro-

    driguez, R. Testing topographic differencesbetween eventrelated brainpotentialsby usingnon-parametric combina-tions of permutation tests [published erratum appears inElectroencephalogr. Clin. Neurophysiol., 1998 Nov;107(5): 380-381]. Electroencephalogr. Clin. Neurophysiol.,1997, 102: 240-247.

    Giard, M.H., Perrin, F., Pernier, J. and Peronnet, F. Several at-tention-related wave forms in auditory areas: a topo-graphic study. Electroencephalogr. Clin. Neurophysiol.,1988, 69: 371-384.

    Grossberg, S.How does a brainbuilda cognitivecode? Psychol.Rev., 1980, 87: 1-51.

    Halpern, A.R. and Zatorre, R.J. When that tune runs throughyour head: a PET investigation of auditory imagery for fa-miliar melodies. Cereb. Cortex, 1999, 9: 697-704.

    Hays, W.L. Statistics. (4th ed.). Harcourt Brace Jovanovich, Or-lando, 1988.

    Hillyard, S.A. and Picton, T.W. On and off components in theauditory evoked potential.Percept. Psychophys., 1978, 24:391-398.

    Janata, P. ERP measures assay the degree of expectancy viola-tion of harmonic contexts in music. J. Cogn. Neurosci.,1995, 7: 153-164.

    Joutsiniemi, S.L. and Hari, R. Omissions of auditory-stimulimayactivate frontal-cortex.EuropeanJournal ofNeurosci-

    ence, 1989, 1: 524-528.Kosslyn, S.M. Image and Brain. The MIT Press, Cambridge,1994.

    Kosslyn, S.M., Alpert, N.M., Thompson, W.L., Maljkovic, V.,Weise, S.B., Chabris, C.F., Hamilton, S.E., Rauch, S.L. andBuonanno, F.S. Visual mental imagery activates topo-graphically organized visual cortex: PET investigations.

    Journal of Cognitive Neuroscience, 1993, 5: 263-287.Le Bihan, D., Turner, R., Zeffiro, T.A., Cuenod, C.A., Jezzard, P.

    and B