Tartter Et Al_2002


    Brain and Language 80, 488–509 (2002). doi:10.1006/brln.2001.2610, available online at http://www.idealibrary.com

    Novel Metaphors Appear Anomalous at Least Momentarily: Evidence from N400

    Vivien C. Tartter, Hilary Gomes, Boris Dubrovsky, Sophie Molholm, and Rosemarie Vala Stewart

    City College of the City University of New York 

    Published online February 11, 2002

    This study addresses a central question in perception of novel figurative language: whether it is interpreted intelligently and figuratively immediately, or only after a literal interpretation fails. Eighty sentence frames that could plausibly end with a literal, truly anomalous, or figurative word were created. After validation for meaningfulness and figurativeness, the 240 sentences were presented to 11 subjects for event-related potential (ERP) recording. The first 200 ms of the ERP is believed to reflect the structuring of the input; the prominence of a dip at around 400 ms (N400) is said to relate inversely to how expected a word is. Results showed no difference between anomalous and metaphoric ERPs in the early window, metaphoric and literal ERPs converging 300–500 ms after the ending, and significant N400s only for anomalous endings. A follow-up study showed that the metaphoric endings were less frequent (in standardized word norms) than were the anomalous and literal endings and that there were significant differences in cloze probabilities (determined from 24 new subjects) among the three ending types: literal > metaphoric > anomalous. It is possible that the low frequency of the metaphoric element and lower cloze probability of the anomalous one contributed to the processes reflected in the early window, while the incongruity and near-zero cloze probability of the anomalous endings produced an N400 effect in them alone. The structure or parse derived for metaphor during the early window appears to yield a preliminary interpretation suggesting anomaly, while semantic analysis reflected in the later window renders a plausible figurative interpretation. © 2002 Elsevier Science (USA)

    Key Words: figurative language; metaphor; anomaly; N400; standard pragmatic model; cloze probability; semantic processing; selectional restrictions.

    By definition, figurative language can be taken in two ways: literally and often anomalously, as when we call a person a rock, and creatively (if the figure is novel), abstracting across the literal meanings of the component words. A central question in understanding how we comprehend figurative language is whether both the literal and figurative meanings of the figure are immediately activated or if one of those meanings is normally achieved preferentially. The “standard pragmatic model” (cf. Gibbs & Gerrig, 1989; Glucksberg, 1991) assumes that we attempt a creative interpretation only after recognizing the nonsense of the literal interpretation. Alternatively, constructivist approaches (e.g., Gibbs, 1994) assume that we arrive at the sensible interpretation guided by context and shared conventions and presumptions, noticing the literal interpretation only if the figure of speech is highlighted.

    Hilary Gomes and Sophie Molholm are also at the Albert Einstein College of Medicine.

    Address correspondence and reprint requests to Vivien C. Tartter, Psychology Department, City College of CUNY, 138th St. at Convent Avenue, New York, NY 10031. Fax: (212) 650-5865. E-mail: [email protected].

    0093-934X/02 $35.00 © 2002 Elsevier Science (USA). All rights reserved.


    The standard pragmatic model grew out of generative semantics, in which understanding of words in combination was assumed to derive from a logical conjunction of general features: Understanding “a canary has feathers” entails understanding that canaries are birds; that birds have feathers; and by implication, that canaries are feathered. Understanding which meaning of “bill” (money, debt, or bird mouth) pertains in a particular phrase depends on whether other bird features occur in that phrase; understanding that “colorless green” is anomalous depends on recognizing the contradiction of “green,” marked as color, and “colorless” marked as not-color (Katz, 1990). Following the principles of logic connecting semantic features (e.g., “person” is animate and “rock” is inanimate), metaphor yields an anomaly which might trigger some special meaning-creating process. The standard pragmatic model contrasts metaphor and simile, since in simile the figurative nature of the language is indicated explicitly using “like” or “as” (as in “The person is like a rock”). Simile is therefore hypothesized to be understood more immediately, with the conjunction triggering the special process. Indeed, one proposal has been that metaphors are understood by conversion mentally to similes, spurring the special process.

    The standard pragmatic model suggests that figurative language should be slower to understand than literal language, since it requires a two-step process. It further suggests for metaphor that the literal meaning would be derived before the figurative one. And it suggests that similes should be easier and faster to understand than metaphors. A number of experiments have demonstrated that these implications are false.

    First, given appropriate context, figurative sentences take no longer than literal sentences to verify (see, e.g., Ortony, 1980). Moreover, metaphors are no harder to comprehend than similes, and subjects do not feel that metaphor and simile are meaning-identical, indicating that they are not “translating” one into the other (Gibbs, 1994). Third, McElree and Nordlie (1999) have shown that meaningfulness decisions for literal and figurative strings of words are made following the same course, suggesting that the interpretation of figurative strings does not require first a literal parse. Finally, figurative interpretations seem to be derived automatically, not secondarily, and can interfere with comprehending literal meaning. Glucksberg (1989; and Gildea & Bookin, 1982) found that subjects are slower at verifying sentences based on literal meaning if there is also a figurative meaning: It takes longer to decide that “some jobs are jails” is literally false than it does to decide so for “some desks are melons.” This suggests that the figurative meaning is created naturally along with the literal meaning, not as a secondary process.

    While there is evidence that figurative interpretation is a normal process not secondary to literal interpretation, we must note that good figurative language does not “feel” normal. Verbrugge and McCarrell (1977) describe original figures of speech as creating tension, as do jokes, relieved by the insight into the metaphor’s meaning. The insight is what makes metaphors “fun.” Johnson (1980) proposes that the tension arises from the apparent contradiction, and the “fun” arises from the construction of the common ground relating the metaphor topic (roughly the subject) to the metaphor vehicle (how you get from the topic to the ground, usually the predicate of the metaphor). If such constructive processes happen in understanding literal sentences, they do not produce as strong a sensation of tension and relief.

    It seems quite possible that the tension arises from semiawareness of the anomaly, and the relief, from its resolution. This is not to say that it is the perception of anomaly that triggers a search for figurative meaning (this Glucksberg’s research belies), although it could for new, creative metaphors. Rather, it could be that several interpretations are derived in parallel, that the anomaly of the literal meaning is noted, and that the simultaneous perception of anomaly and meaning is enjoyable. Both alternatives suggest that the appreciation of figurative language derives from perceiving potential anomaly, contrary to the conclusion of much of the current research on metaphor comprehension.

    The current study tests specifically for the possibility that anomaly is recognized at some stage in the understanding of a metaphor. To this end, we recorded event-related potentials (ERPs) from subjects as they read novel metaphors. ERPs provide a very powerful technique for studying the temporal and spatial patterns of cortical electrical activity during stimulus processing. We were particularly interested in the N400, a component of ERPs which is sensitive to semantic relatedness and expectancy. Typically, the N400 (a negative-going peak about 400 ms after the target word which generally has a central parietal maximum) elicited by a specific word is smaller when the word is expected and related to the semantic context than when it is less expected and less related to the context. This difference in amplitude is referred to as the N400 priming effect (Friederici, 1997; Kutas & Hillyard, 1980). We assumed that in semantic processing of a metaphor, if there is a point before common ground is constructed when the vehicle of the metaphor is seen as incongruous, there should be a more negative N400 than for a literal sentence (an N400 effect).

    We note that ours is not the first study of ERPs and metaphor. Pynte, Besson, Robichon, and Poli (1996) reported N400s for very short common French metaphors (e.g., “those fighters are lions” and “those apprentices are jars [clumsy]”), literal sentences, and unfamiliar metaphors, created by scrambling topic and vehicle from the familiar ones (e.g., “those apprentices are lions”). Across experiments the metaphors were presented alone or with a preceding sentence that contextualized the metaphor appropriately (e.g., “They are not cowardly: Those fighters are lions”) or not (e.g., “They are not naive: Those apprentices are lions”). Pynte et al. found a reliable N400 effect even for the common figures of speech and a greater one for the novel figures of the same form. Unsurprisingly, providing an inappropriate context increased the size of the N400 for both kinds of sentences. It is important to note, though, that Pynte et al. included no anomalous conditions. Since overall context (i.e., the interrelationships among the stimuli in the set) has been shown to affect the N400, at least for pairs of associated or unassociated words (Brown, Hagoort, & Chwilla, 1996; Deacon, Breton, Ritter, & Vaughan, 1991; Holcomb, 1988), results from a stimulus set including no truly anomalous sentences are difficult to interpret. Relative to the literal sentences the metaphors would be more anomalous, but we have no way of knowing whether they are processed as anomalously as are truly anomalous sentences.

    Moreover, while their N400 results suggest that the common metaphors were viewed as anomalous, it is not clear that they were in fact viewed as figurative: Subjects were simply asked to read the sentences for comprehension and their brain waves were recorded; they were not asked to paraphrase the meanings, so we do not know how they interpreted the figurative ones. In addition, the sentences were all short and of a given form, contexts unlike those of natural figurative language and therefore unlikely to provoke the shared conventions for figurative speech. Finally, since the “figures” of speech in this study were all common, i.e., dead metaphors, their comprehension could require recognition, rather than construction, of the common ground. Thus this study does not really tap into the process of understanding figures of speech or show whether it entails a moment when the anomaly of the literal reading surfaces.

    The present study is designed to record N400s in unfamiliar figures of speech, rated as good metaphors and understood metaphorically by subjects. We hypothesize that if subjects automatically use context to construct the ground, new but good figurative sentences should not yield a more negative N400 (no larger an N400) than do new, good literal sentences. On the other hand, if the anomalous literal meaning of a new figurative sentence is recognized at some point, there should be a reliable N400 effect. We include in our test materials both literal and truly anomalous sentences to provide anchor points for judging the magnitude of the N400 and to mitigate context effects.

    PRELIMINARY STUDY: STIMULUS VALIDATION

    Our purpose is to examine ERPs during the processing of metaphors, comparing them to those occurring in understanding literal and anomalous sentences. Unlike earlier research, we were interested in looking at novel, creative metaphors, which we hoped would produce on first encounter the experience of tension that other researchers have described. It was thus necessary to create new metaphors and ensure that subjects would indeed perceive them figuratively and as “good” metaphors. We also needed sets of control sentences (literal and truly anomalous versions) that naive subjects would perceive as such. The first experiment therefore measured subjects’ rankings of candidate sentences on meaningfulness and figurativeness scales to produce valid sets of stimuli.

     Methods

    Sentence preparation. Eighty metaphors were created such that (1) each sentence expressed sufficient context to be appropriately interpreted with no additional information needed, (2) the figure was provided in only one word, and (3) the figurative word was the last word in the sentence. The last two requirements derived from the need to have a clear starting point for assessing N400, which should be present approximately 400 ms after the onset of the last word in the sentence. Once the metaphors were created, literal and anomalous versions were produced by altering the last word. So, for example, the metaphor “His face was contorted by an angry cloud” was made literal by substituting the word “frown” for “cloud” and was rendered anomalous by substituting the word “map” for “cloud.”

    The Appendix displays the ultimate sets of sentences. As indicated, the metaphors are not all original. Some were borrowed from the published literature on metaphor: For example, “the teenager’s face was a coral reef” was a version of a sentence used by Radencich and Baldwin (1985). Others were “created” by consulting poems or noting figures of speech appearing in the newspaper, conversation, and so on.

    Subjects. Seven monolingual native American English speakers attending college or graduate school were volunteers for this experiment.

    Procedure. Three blocks of sentences were created from the 240 stimuli. Each block contained all 80 sentence frames, each with only one ending. Endings were pseudorandomly assigned to a block so that there would be roughly equal numbers of literal, anomalous, and metaphorical sentences in each (27 of two types and 26 of the third in each block). Across the three blocks each sentence was presented with all three endings. The order of the sentences was separately randomized in each block.
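The block construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper says endings were assigned "pseudorandomly" without giving the procedure, so the shuffle-then-rotate scheme, the function names `assign_blocks` and `presentation_order`, and the seed are all hypothetical.

```python
import random
from collections import Counter

ENDINGS = ("literal", "metaphoric", "anomalous")

def assign_blocks(n_frames=80, seed=0):
    """Return three blocks, each mapping frame index -> ending type.

    Assumed scheme (not from the paper): shuffle the frames, then rotate
    the ending by block index so each frame gets every ending exactly once.
    """
    rng = random.Random(seed)
    frames = list(range(n_frames))
    rng.shuffle(frames)
    blocks = [{} for _ in ENDINGS]
    for position, frame in enumerate(frames):
        for b, block in enumerate(blocks):
            block[frame] = ENDINGS[(position + b) % len(ENDINGS)]
    return blocks

def presentation_order(block, rng):
    """Randomize the sentence order separately for one block."""
    order = list(block)
    rng.shuffle(order)
    return order

blocks = assign_blocks()
for block in blocks:
    # Each block holds 27 sentences of two types and 26 of the third.
    print(sorted(Counter(block.values()).values()))  # [26, 27, 27]
```

With 80 frames and three endings, any such rotation necessarily yields the 27/27/26 split reported in the text.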

    Each subject was asked to rate two blocks of the sentences, each on one of two 5-point scales. One scale assessed Meaningfulness (1 = completely nonsensical to 5 = very meaningful) and the other assessed Figurativeness (1 = strictly literal to 5 = strictly metaphorical). Thus each sentence frame was seen twice by each participant, once in one group and once in another group, with one ending being rated on one scale and another ending being rated on the other scale. Each ending was rated on each scale by between one and three of the subjects. The order of the scales and the blocks of sentences were roughly counterbalanced across the seven subjects.

    Sentences were presented to the subjects in written format for the subjects to take home and rate at their leisure. When the forms were given to the subjects, written and oral instructions were provided, and any questions the subject had were answered. For the meaningfulness scale, the written instructions were as follows: “Please rate each of the following sentences on the ‘meaningfulness’ scale (1 = completely nonsensical; 5 = very meaningful). For example, the sentence ‘All reserve materials must be eaten within the library’ would be rated as ‘1’ (completely nonsensical), and the sentence ‘American society does not prioritize sleep as an important health factor’ would be rated as ‘5’ (very meaningful). Circle your answer.” For the metaphor scale, the instructions read as follows: “Please rate each of the following sentences on the ‘metaphor’ scale (1 = strictly literal; 5 = strictly metaphorical). For example, the sentence ‘Cigarette smoke contains carbon monoxide’ is rated ‘1’ (strictly literal) and the sentence ‘Smoking causes dandruff of the lungs’ is rated ‘5’ (strictly metaphorical). Circle your answer.”


     Results

    For each scale, the ratings of all participants on a given ending were averaged. For a sentence triad to be successful, the metaphor sentence was to score high (an average of 3.5 or more) on metaphoricalness and meaningfulness, the literal sentence was to score low on metaphoricalness (less than 2.5) and high on meaningfulness (more than 3.5), and the anomalous sentence was to score low on meaningfulness (less than 2.5). We anticipated that the anomalous sentences might be rated highly metaphoric, since subjects had to select between “metaphoric” and “literal,” and the anomalous sentences were clearly not literal.
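The acceptance criteria for a triad translate directly into a predicate. The sketch below is illustrative only: the dictionary layout, the key names, and the example ratings are hypothetical, while the thresholds are the ones stated in the text.

```python
def triad_passes(mean_ratings):
    """Check one sentence triad against the stated validation criteria.

    mean_ratings maps ending type -> {"meaningful": x, "figurative": y},
    where x and y are ratings averaged over subjects (hypothetical layout).
    """
    met = mean_ratings["metaphoric"]
    lit = mean_ratings["literal"]
    anom = mean_ratings["anomalous"]
    return (
        met["figurative"] >= 3.5 and met["meaningful"] >= 3.5  # good metaphor
        and lit["figurative"] < 2.5 and lit["meaningful"] > 3.5  # plainly literal
        and anom["meaningful"] < 2.5  # nonsensical; figurativeness unconstrained
    )

# Made-up ratings for illustration, not data from the study.
example = {
    "metaphoric": {"meaningful": 3.7, "figurative": 4.0},
    "literal": {"meaningful": 4.6, "figurative": 1.5},
    "anomalous": {"meaningful": 1.8, "figurative": 4.3},
}
print(triad_passes(example))  # True
```

Note that the anomalous ending is not constrained on the figurativeness scale, which matches the authors' expectation that anomalous sentences could be rated as highly metaphoric.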

    Sixty-four of the original 80 triads met these criteria. For the criteria-meeting sentences, the literal endings were rated as 4.6 ± .9 on the meaningfulness scale and 1.6 ± .8 on the metaphoric scale. The anomalous sentences were rated as 1.8 ± .9 on the meaningfulness scale and, as we had anticipated, 4.3 ± .7 on the metaphoric scale. Finally, and most importantly, the metaphoric sentences were rated both metaphoric (3.5 ± 1.0) and meaningful (3.7 ± 1.3).

    A score for each sentence ending on each scale was computed by averaging subject ratings for the 64 sentences that met the criteria. These were submitted to a one-way repeated- (across sentence frame) measures analysis of variance assessing the effect of ending type (literal vs metaphoric vs anomalous). Both scales showed highly significant differences [for meaningfulness, F(2, 62) = 145.94, p < .0005; for metaphoricalness, F(2, 62) = 258.22, p < .0005]. Post hoc t tests (p < .002) showed that the anomalous endings were rated as significantly less meaningful and more metaphorical than the literal and metaphorical endings, and the metaphorical sentences were rated as significantly less meaningful and more metaphorical than the literal sentences. (We note that McElree and Nordlie, 1999, also found that subjects judged figurative sentences as less meaningful than literal ones; the absence of the fuller appropriate pragmatic context of natural use may be responsible.)

    Of the 16 triads that did not pass muster, 10 were because the anomalous ending was rated as meaningful, and this ending was changed. For 4 of the 16 sentences the anomalous endings were also changed, although they had been rated as unmeaningful by the subjects, because the authors judged them as too easily interpreted. On one sentence all three endings were rated as both metaphorical and meaningful, and so these were changed. A final sentence was changed after the ratings to maintain the same phrase structure, the article “an” introducing the noun phrase across all three versions. The new endings were validated by consensus of all the authors and four new subjects.

    ERP EXPERIMENT

     Methods

    Subjects. Eleven paid volunteers served as subjects. Their average age was 26 years. All subjects were right-handed fluent English speakers, either monolingual or having learned English by the age of five years. By self-report all subjects were healthy with normal or corrected-to-normal vision.

    Stimuli. The stimuli were produced and evaluated as described in the previous study. The sentences employed are displayed in the Appendix.

    The stimuli were presented visually on a dark computer screen typed in uppercase, 1.20-cm-tall, white characters. A sentence frame with dots indicating the position of the final word, or ending, was presented first, and then the final word was displayed. Both the sentence frames and their final words were centered on the screen. The final words subtended approximately 0.87° vertically and, on the average, 5.52° horizontally.

    Preceding the display of the frame a small green square indicating the focal point appeared on the screen. It remained on the screen until the subject pressed a response button, triggering the sentence frame 1 s later. Another button press indicated that the subject had read the frame, and 500 ms later the ending completed the sentence. The final word was displayed for 1 s, and the green square reappeared 2 s later.

    FIG. 1. The configuration of recording electrodes viewed from the top of the head.

    ERP recording. The recordings were obtained using a 31-channel electrode cap that incorporated a subset of the International 10-10 system (new and modified combinatorial nomenclature, American Electroencephalographic Society, 1990): Fpz, Fz, Cz, Pz, Oz, FP1, FP2, F7, F8, F3, F4, FC1, FC2, FC5, FC6, T7, T8, C3, C4, CP1, CP2, CP5, CP6, P7, P8, P3, P4, O1, O2, LM (left mastoid), and RM (right mastoid). Electrode placements are displayed in Fig. 1.

    Vertical eye movements were recorded with a bipolar configuration of FP1 and an electrode below the left eye. Horizontal eye movements were recorded at F7 and F8. A left earlobe electrode served as the reference and an electrode at PO9 served as the ground. All impedances were maintained below 5 kΩ. Amplifiers with a gain of 30,000 and filter settings of .05–100 Hz were used. The continuous EEG for all channels was monitored during the recordings so that any problems with the electrodes could be identified and feedback about excessive motor movement could be given. The total recording epoch was 1100 ms including a prestimulus interval of 100 ms. The digitization rate was 256 Hz. Each epoch was baseline-corrected across the entire sweep before artifact rejection and averaging. The averages from each block were baseline-corrected again using the average amplitude of the prestimulus portion of the epoch. Artifact reject levels were set at ±100 µV for all electrodes to exclude blinks and movement artifacts. Individual block averages were visually examined for residual artifact.
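The epoch-handling steps just described (whole-sweep baseline correction, ±100 µV rejection, averaging, then a second correction to the prestimulus mean) can be sketched for a single channel as follows. This is a minimal pure-Python illustration, not the authors' software; the function name `average_block` and the toy data are hypothetical, and at 256 Hz the 100-ms prestimulus interval would be roughly 26 samples, which is itself an assumption about rounding.

```python
def mean(values):
    return sum(values) / len(values)

def average_block(epochs, n_prestim, reject_uv=100.0):
    """Average one channel's epochs (lists of microvolt samples).

    Steps follow the text: (1) subtract each epoch's whole-sweep mean,
    (2) drop epochs exceeding +/- reject_uv, (3) average the survivors,
    (4) subtract the mean of the prestimulus samples from the average.
    Returns the corrected average and the number of epochs kept.
    """
    kept = []
    for epoch in epochs:
        sweep_mean = mean(epoch)
        corrected = [v - sweep_mean for v in epoch]
        if max(abs(v) for v in corrected) > reject_uv:
            continue  # treated as a blink or movement artifact
        kept.append(corrected)
    if not kept:
        raise ValueError("all epochs were rejected")
    average = [mean(samples) for samples in zip(*kept)]
    baseline = mean(average[:n_prestim])
    return [v - baseline for v in average], len(kept)

# Toy data: two clean epochs and one with a 500-microvolt excursion.
toy_epochs = [[1, 1, 3, 3], [2, 2, 4, 4], [0, 0, 0, 500]]
erp, n_kept = average_block(toy_epochs, n_prestim=2)
print(erp, n_kept)  # [0.0, 0.0, 2.0, 2.0] 2
```

A real pipeline would apply this per electrode and per block, and the text notes that block averages were additionally inspected by eye.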

    Design. The 240 sentences constituted 3 blocks of 80. For a first group, 27 frames were randomly selected to be used with literal endings, 26 frames were selected from the remainder to be used with metaphorical endings, and the final set was used with anomalous endings. For a second group the sentence frames in the first group that had been used with literal endings were now used with anomalous endings, the ones that had had metaphorical endings now ended literally, and the ones with anomalous endings ended metaphorically. A third group was created by using the frames with the endings not used in the other two groups. Each group was then divided in half (yielding six half-groups) and sentences within it were pseudorandomized with three restrictions: (1) no more than two sentences of the same category appeared in a row; (2) both halves of each group contained near-equal (13 or 14) sentences of each category; and (3) for three pairs of sentences the frames of which were structurally or semantically similar (e.g., 20 and 21 in the Appendix), one member only of each pair appeared in a given half. Presentation of all six half-groups was administered to each subject, with order determined by a Latin square counterbalancing the three pairs of blocks with order within each pair counterbalanced between paired subjects.
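Two parts of this design lend themselves to a short sketch: deriving the second and third groups by rotating the endings, and ordering sentences so that no more than two of the same category appear in a row. The code below is an assumption-laden illustration: the paper does not say how the pseudorandomization was carried out, so the rejection-sampling shuffle and all names here are hypothetical, and restrictions (2) and (3) are not implemented.

```python
import random

# Rotation stated in the text: literal -> anomalous, metaphoric -> literal,
# anomalous -> metaphoric when moving from one group to the next.
NEXT_ENDING = {"literal": "anomalous", "metaphoric": "literal",
               "anomalous": "metaphoric"}

def rotate(group):
    """Derive the next group (frame -> ending) from the previous one."""
    return {frame: NEXT_ENDING[ending] for frame, ending in group.items()}

def max_run(categories):
    """Length of the longest run of identical consecutive categories."""
    longest = run = 0
    previous = None
    for category in categories:
        run = run + 1 if category == previous else 1
        previous = category
        longest = max(longest, run)
    return longest

def constrained_order(items, category_of, rng, limit=2, max_tries=100_000):
    """Reshuffle until no category repeats more than `limit` times in a row."""
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if max_run(category_of(i) for i in order) <= limit:
            return order
    raise RuntimeError("no ordering found within max_tries")

# Hypothetical half-group of 40 frames with 14/13/13 endings.
endings = list(NEXT_ENDING)
half = {frame: endings[frame % 3] for frame in range(40)}
order = constrained_order(half, half.get, random.Random(0))
print(max_run(half[f] for f in order) <= 2)  # True
```

Applying `rotate` twice to the first group gives the third group, in which every frame carries the one ending it has not yet been paired with.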

    Procedure. Subjects were tested individually. Upon signing the consent form and filling out the personal information sheet, each subject was fitted with an electrode cap and seated comfortably in a reclining chair in a normally illuminated, sound-attenuated room in front of a monitor. The distance between the center of the screen and the subject’s right eyeball was set at the beginning of the session to 31 in.

    The subjects were told that a small green square would appear in the center of the screen as an indication that a sentence was ready. They were instructed to press a response button to display the sentence frame and that when they had completed reading the frame, they should press the response button to display the sentence end. They were asked to try to stay still during the recording and to move and blink only when the green square was displayed. Fifteen practice sentences were presented, as the subjects were monitored for conformity to the instructions. No subject required a second practice run.

    The subjects were also instructed that after each block they would be given a recognition task, for which they would have to choose from among several sentences those presented during that block. The recognition task was necessary to ensure that subjects were in fact reading the sentences. For the recognition test, one literal, one anomalous, and one metaphorical sentence were randomly selected from each half of each block and then pseudorandomly mixed with six foils, two from each semantic category (sentences that were “experimental” in other blocks), such that no more than two experimental or two foil sentences and no more than two sentences from each semantic category appeared in a row.

    The subjects were not alerted to the different semantic categories of the experimental sentences. All but one subject completed the six experimental runs and following recognition tasks in one sitting; the remaining subject took a 5-min break between the third and fourth blocks.

    After the trial was complete, subjects were disconnected from the recording computer and allowed to rest for 10–15 min. Then each subject was seated in a separate room, given a tape recorder and printed list of the 240 sentences [in a different random order (the same for all subjects), created in the same manner as for the experimental trials], and asked to state the number of each sentence and to restate in his/her own words the meaning of the sentence.

    The entire procedure took 3 to 4 h: 1 h for hook-up, 1 h for the six blocks, 1 h for paraphrasing, and up to 30 min for breaks.

    Data analysis. As a preliminary step in the data analysis, two judges scored the subjects’ paraphrases in order to determine whether the sentences were perceived by the subjects as expected. Each paraphrase was categorized as constituting a literal, anomalous, or metaphoric interpretation of the original. The judges had to know what the stimulus sentences were in order to judge the paraphrase. Consequently, they were aware of the category to which each original sentence was assigned. In rare cases of disagreement, the judges discussed the paraphrase in question until an agreement was reached. For the final analysis, only those sentence triads were chosen for which none of the sentences in the triad was paraphrased “incorrectly” by more than four subjects. This procedure yielded 55 sentence triads.
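The inclusion rule for the final analysis is a simple filter over triads. In this sketch the data layout (a count of subjects who paraphrased each sentence "incorrectly") and the example counts are hypothetical; only the threshold of four subjects comes from the text.

```python
def keep_triad(wrong_counts, max_wrong=4):
    """Keep a triad only if no sentence was misparaphrased by more than
    max_wrong subjects. wrong_counts maps ending type -> subject count."""
    return all(count <= max_wrong for count in wrong_counts.values())

# Made-up counts for two triads, for illustration only.
triads = {
    "frame_A": {"literal": 0, "metaphoric": 3, "anomalous": 1},
    "frame_B": {"literal": 0, "metaphoric": 5, "anomalous": 0},
}
selected = [name for name, counts in triads.items() if keep_triad(counts)]
print(selected)  # ['frame_A']
```

Because the rule is applied to the whole triad, one poorly understood ending removes all three versions of that frame, which keeps the three conditions matched on sentence frames.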

    For the 55 triads that were selected for the final analysis, ERPs time-locked to the onset of the final word were averaged for each subject within each semantic category. The data were then averaged across subjects and plotted as a function of time elapsed after the onset of the final word, with each of the three semantic conditions shown separately. Examination of the grand averages indicated that the waveform elicited by the literal sentences began separating from those elicited by the metaphoric and the anomalous sentences approximately 160 ms after the onset of the final word (see Figs. 2 and 3). Further, the waveforms elicited by the anomalous and the metaphoric sentences began separating from each other at approximately 280 ms after the final word. Consequently, average amplitude was measured in two windows that reflected the peaks of these separations: 200–300 ms (early) and 300–500 ms (late). Individual ERP data were measured for the two windows separately for literal, anomalous, and metaphor conditions, resulting in six average amplitude measures per subject. The effect of semantic condition on average amplitude at Fz, Cz, Pz, FC1, FC2, CP1, CP2, C3, and C4 was tested separately for the early and late windows using two-way repeated-measures analyses of variance (electrode × sentence condition). These electrode sites were chosen because N400 is typically found to be largest in the regions they represent. Post hoc analyses of variance were used to compare pairs of sentence types.
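Measuring average amplitude in the early and late windows amounts to converting latencies to sample indices and taking a mean. The sketch below assumes the recording parameters given earlier (256 Hz, 100-ms prestimulus interval); the rounding rule, the half-open window, and the function names are assumptions, not details from the paper.

```python
SRATE_HZ = 256    # digitization rate stated in the text
PRESTIM_MS = 100  # prestimulus interval stated in the text

def ms_to_sample(t_ms, srate_hz=SRATE_HZ, prestim_ms=PRESTIM_MS):
    """Index of the sample nearest to a poststimulus latency (assumed rounding)."""
    return round((t_ms + prestim_ms) * srate_hz / 1000)

def window_mean(waveform, start_ms, end_ms):
    """Mean amplitude over the half-open latency window [start_ms, end_ms)."""
    segment = waveform[ms_to_sample(start_ms):ms_to_sample(end_ms)]
    return sum(segment) / len(segment)

# Synthetic 282-sample epoch: 1 microvolt in the early window, 2 in the late.
lo, mid, hi = ms_to_sample(200), ms_to_sample(300), ms_to_sample(500)
waveform = [0.0] * lo + [1.0] * (mid - lo) + [2.0] * (hi - mid) + [0.0] * (282 - hi)
print(window_mean(waveform, 200, 300), window_mean(waveform, 300, 500))  # 1.0 2.0
```

Running this once per subject, electrode, and condition would give the six window measures per subject that enter the analyses of variance.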

    As a second step in the analysis, the effect of semantic category on the late window was further evaluated, controlling for the amplitude of the waveform in the early period. Each subject’s average amplitude in the early window was subtracted from his or her average amplitude in the later window for the respective semantic condition. The difference amplitudes thus obtained were then tested for an effect of semantic conditions using a two-way analysis of variance (electrode × sentence condition). The electrode sites used in this analysis were the same as above. Post hoc analyses of variance were again used to compare pairs of sentence types.
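The early-window control is a per-condition subtraction, late minus early. The values in this sketch are invented integers chosen only to make the arithmetic easy to follow; they are not amplitudes from the study, and the function name is hypothetical.

```python
def late_minus_early(early, late):
    """Difference amplitude per condition: late window minus early window."""
    return {condition: late[condition] - early[condition] for condition in early}

# Invented amplitudes (microvolts) for one subject at one electrode.
early = {"literal": 5, "metaphoric": 3, "anomalous": 3}
late = {"literal": 4, "metaphoric": 3, "anomalous": 0}
print(late_minus_early(early, late))
# {'literal': -1, 'metaphoric': 0, 'anomalous': -3}
```

In this made-up case the metaphoric and anomalous conditions start from the same early amplitude, so any difference between their difference scores reflects the late window alone, which is the point of the control.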


    FIG. 2. Average amplitudes at Cz for literal (thick), anomalous (dashed), and metaphoric (thin) sentences at early and late processing points.

    FIG. 3. Grand mean ERPs elicited by the final words in the sentences at all electrode sites. The thin lines are the ERPs elicited in the literal condition, the dashed lines are the ERPs elicited in the metaphoric condition, and the thick lines are the ERPs elicited in the anomalous condition. In this and all subsequent figures, stimuli were presented at time zero and marks on the x axis represent 100-ms steps. Waveforms in this and all subsequent figures were filtered at 30 Hz for clarity of display.


    The topography of the early and late effects was also examined and compared across sentence type. For this analysis, the average amplitudes of the difference waves (anomalous minus literal and metaphoric minus literal) for the early and late windows were scaled according to the procedure described in McCarthy and Wood (1985). Electrodes were then grouped by region: anterior (Fpz, Fz, Fp1, Fp2, F3, and F4), central (Cz, FC1, FC2, C3, C4, CP1, and CP2), posterior (Pz, Oz, P3, P4, O1, and O2), lateral left (F7, FC5, T7, CP5, P7, and LM), and lateral right (F8, FC6, T8, CP6, P8, and RM). The average scaled amplitudes for each region were submitted to a three-way repeated measures analysis of variance. Factors for this analysis were region (five levels), difference waveform (anomalous/literal difference vs metaphor/literal difference), and time window (early vs late).
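The regional grouping can be written down directly from the electrode lists above. The scaling step is less certain: the text cites McCarthy and Wood (1985) without saying which normalization was applied, so the vector-length scaling shown here is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math

# Electrode groupings as listed in the text.
REGIONS = {
    "anterior": ["Fpz", "Fz", "Fp1", "Fp2", "F3", "F4"],
    "central": ["Cz", "FC1", "FC2", "C3", "C4", "CP1", "CP2"],
    "posterior": ["Pz", "Oz", "P3", "P4", "O1", "O2"],
    "lateral_left": ["F7", "FC5", "T7", "CP5", "P7", "LM"],
    "lateral_right": ["F8", "FC6", "T8", "CP6", "P8", "RM"],
}

def vector_scale(amplitudes):
    """Divide every site by the vector length across sites (assumed variant),
    so that only the shape of the scalp distribution remains."""
    length = math.sqrt(sum(v * v for v in amplitudes.values()))
    return {site: v / length for site, v in amplitudes.items()}

def region_means(amplitudes):
    """Average the (scaled) amplitudes within each scalp region."""
    return {
        region: sum(amplitudes[site] for site in sites) / len(sites)
        for region, sites in REGIONS.items()
    }

# Flat synthetic topography: every site has the same difference amplitude.
flat = {site: -2.0 for sites in REGIONS.values() for site in sites}
scaled = vector_scale(flat)
print(round(sum(v * v for v in scaled.values()), 6))  # 1.0
```

Scaling of this kind removes overall amplitude differences between conditions, so that a region by condition interaction can be read as a difference in distribution rather than in strength.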

    An α level of .05 was used for all statistical tests. The Geisser–Greenhouse procedure was used to correct the degrees of freedom and p value for the respective F test. Only corrected degrees of freedom and p values are reported.

     Results

    The grand average waveforms elicited by the three different endings are presented in Fig. 3. The waveforms are similar to those reported in other N400 studies (for example, Kutas, Lindamood, & Hillyard, 1984), except that the amplitude of the N400 in the anomalous condition is not large enough to offset the early positive shift and pull the waveform below the baseline (also see Federmeier & Kutas, 1999; Kutas & Iragui, 1998). N400, which generally has a central parietal maximum (Friederici, 1997), is seen for each of the three conditions at most of the electrode sites depicted, as the large negative-going deflection peaking between 390 and 430 ms.

    An examination of the ERPs in Fig. 3 suggests that the ERPs elicited by the final words in the anomalous and metaphoric sentences begin to differ in amplitude from those elicited by the final word in the literal sentences at approximately 160 ms and remain separated for the rest of the epoch. The ERPs elicited by the anomalous and metaphoric endings, while similar during early processing, begin to diverge at approximately 280 ms. These observations are supported by the mean amplitudes at Fz, Cz, Pz, FC1, FC2, CP1, CP2, C3, and C4 in the early (200–300 ms) and late (300–500 ms) windows depicted in Tables 1 and 2, respectively.

    Statistical analyses also support these observations. An analysis of variance comparing the average amplitudes in the early window at nine electrode sites found a significant effect of Semantic Condition [F(1, 10) = 10.8, p < .01]. Neither electrode nor the interaction was significant. The amplitude of the waveform in the early window elicited by the literal endings was significantly more positive than those elicited by the anomalous or metaphoric endings [F(1, 10) = 16.02, p < .005, and F(1, 10) = 15.49, p < .005, respectively], with the latter two not being different from each other. An analysis of variance comparing the amplitudes in the later time window also found a significant effect of Semantic Condition [F(1, 10) = 13.6, p < .005]. Again, neither electrode nor the interaction was significant. Post hoc analyses indicated significant differences among all three sentence types. Thus, anomalous endings resulted in significantly more negative deflections in the later window than did literal or metaphoric endings [F(1, 10) = 15.36, p < .005 and F(1, 10) = 10.04, p < .01, respectively], and metaphorical endings produced a significantly more negative response than did literal ones [F(1, 10) = 11.80, p < .01]. That is, there is a greater N400 effect for anomalous sentences than for metaphorical sentences. Moreover, if only the late time window region of the ERPs is examined, it also seems that there is a greater N400 for metaphorical than literal sentences.

    TABLE 1
    Average Mean Amplitude (in Microvolts) and Standard Deviations (in Parentheses) for the Early Window (200–300 ms) for All Electrode Locations

           Literal      Metaphoric   Anomalous
    Fz     5.4 (4.2)    3.3 (3.5)    3.6 (3.2)
    Cz     5.4 (4.1)    3.3 (3.5)    3.6 (3.7)
    Pz     3.6 (3.4)    2.0 (2.9)    2.0 (3.6)
    FC1    5.4 (3.9)    3.5 (3.5)    3.5 (3.1)
    FC2    5.9 (4.3)    3.7 (3.3)    4.1 (3.3)
    CP1    4.7 (3.7)    2.8 (3.1)    3.0 (3.6)
    CP2    5.1 (3.8)    2.9 (3.0)    3.1 (3.5)
    C3     4.6 (3.2)    3.1 (2.9)    3.1 (3.0)
    C4     5.6 (3.6)    3.5 (2.8)    3.8 (3.1)

    TABLE 2
    Average Mean Amplitude (in Microvolts) and Standard Deviations (in Parentheses) for the Late Window (300–500 ms) for All Electrode Locations

           Literal      Metaphoric   Anomalous
    Fz     6.9 (5.1)    4.6 (4.4)    2.9 (3.8)
    Cz     6.5 (4.6)    4.0 (4.2)    2.0 (3.9)
    Pz     5.1 (3.4)    3.0 (3.3)    1.4 (3.2)
    FC1    6.5 (4.5)    4.0 (4.2)    2.2 (3.9)
    FC2    7.1 (4.8)    4.8 (4.0)    2.8 (3.8)
    CP1    5.9 (3.5)    3.4 (3.6)    1.9 (3.6)
    CP2    6.0 (4.0)    3.6 (3.6)    1.9 (3.5)
    C3     5.6 (3.2)    3.5 (3.4)    1.8 (3.6)
    C4     6.7 (3.9)    4.2 (3.7)    2.5 (3.3)

    However, the amplitudes of the ERPs for the literal and metaphorical sentences differ greatly in the region of the early time window, after which the waveforms follow a parallel trajectory. Thus the apparent N400 effect in the later time window for metaphor may be attributable to the increased negativity in the early region. The change in average amplitude from the early to the late time region is relatively similar for the metaphoric and literal endings, suggesting that the difference in amplitude between these two conditions in the later window is due to the earlier negativity. In contrast, the amplitude separation between the waveforms elicited by the anomalous and the literal endings continues to increase from the early time window to the late time window.
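    The pattern can be illustrated with the group means at Cz reported in Tables 1 and 2. The sketch below is only a worked example on those published means; the reported analysis used each subject’s own amplitudes, and the variable names are chosen here for illustration.

```python
# Group mean amplitudes at Cz, copied from Table 1 (early) and Table 2 (late).
early_cz = {"literal": 5.4, "metaphoric": 3.3, "anomalous": 3.6}
late_cz = {"literal": 6.5, "metaphoric": 4.0, "anomalous": 2.0}

# Early-to-late change within each condition.
change = {cond: round(late_cz[cond] - early_cz[cond], 1) for cond in early_cz}

# Separation from the literal condition in each window.
gap_early = {c: round(early_cz["literal"] - early_cz[c], 1)
             for c in ("metaphoric", "anomalous")}
gap_late = {c: round(late_cz["literal"] - late_cz[c], 1)
            for c in ("metaphoric", "anomalous")}

print(change)     # {'literal': 1.1, 'metaphoric': 0.7, 'anomalous': -1.6}
print(gap_early)  # {'metaphoric': 2.1, 'anomalous': 1.8}
print(gap_late)   # {'metaphoric': 2.5, 'anomalous': 4.5}
```

    At this site the literal and metaphoric means both rise by about 1 unit between windows, so their separation barely changes (2.1 vs 2.5), whereas the anomalous mean falls and its separation from the literal mean more than doubles (1.8 vs 4.5).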

    Therefore, the effect of semantic category on the later time window encompassing the N400 was further evaluated, controlling for the amplitude in the early time window by using a measure of the difference in amplitude between early and late periods. An analysis of variance comparing these difference amplitudes found a significant effect of type of ending [F(1, 10) = 6.94, p < .05]. Post hoc tests indicated no difference in amplitude in the later time region between the literal and metaphorical conditions [F(1, 10) = 0.41, ns] when the amplitude of the waveform in the early window was controlled. However, the increased difference in amplitude of the late window (N400) elicited by the anomalous ending was significantly greater than that elicited by either literal or metaphoric endings [F(1, 10) = 9.39, p < .025 and F(1, 10) = 10.99, p < .01, respectively].

    Topography of the early and late (N400) effects.   The grand average difference waveforms derived by subtracting the waveforms elicited in the literal condition from those elicited in the anomalous (thick line) and the metaphoric (thin line) conditions are depicted in Fig. 4 for a subset of the electrodes to illustrate the topography. Both the early and late effects were largest at the central electrodes and somewhat smaller at the right and left lateral, anterior, and posterior electrodes. Further, for the anomalous sentences, the effect in the late time window appeared to be somewhat larger on the right than on the left. Both the central maximum and the right-sided bias are consistent with the topography reported for the N400 effect in the literature (Federmeier & Kutas, 1999; Friederici, 1997; Kutas & Van Petten, 1994). However, the topographic effects were small and found to be nonsignificant.

    FIG. 4.   Grand mean difference waveforms constructed by subtraction of the ERPs elicited in the literal condition from those elicited in the metaphoric (thin lines) and anomalous conditions (thick lines) at Fpz, Fz, Cz, Pz, Oz, FC5, CP5, FC6, and CP6.

    DISCUSSION

    The present experiment was designed to determine whether novel metaphors would elicit an N400 effect as do anomalous sentences, indicating perhaps that, initially, figurative language is seen as anomalous, following which a constructive process allowing figurative interpretation is triggered. We obtained similar ERP results for metaphoric and anomalous sentences only during an early epoch not generally associated with semantic analysis (Friederici, 1997), after which the metaphoric sentences produced ERPs following similar trajectories to those produced by literal sentences. Most critically, in the later time window associated with N400, a pronounced negative deflection, which usually signals incongruity, was observed relative to the amplitudes in the early window only for the anomalies.

    Friederici (1997) has identified a window in the vicinity of N200 (encompassed by our early epoch) as reflecting a primarily syntactic process, with the window around N400 attributed to lexical-semantic processing. The divergence in the early window of literal from metaphoric and anomalous sentences in our study (all of which were syntactically well-formed) suggests, if Friederici is correct, that some aspect of this syntactic process is sensitive to selectional constraints among words, a possibility consistent with Chomsky’s (1965) Extended Standard Theory of syntax. Thus the early window also reflects a form of lexical processing, with perhaps a fuller semantic analysis indicated by the later window activity, where, in our study, metaphor converges with literal interpretation and diverges from anomaly.

    It is important to note, however, that the N400 effect has been obtained with word pairs, which do not constitute a phrase or clause to which a parse or semantic analysis could be applied. Thus, to some extent the N400 may reflect simply the expectancy of a particular lexical item, where expectancy is determined either by lexical parameters such as word frequency or by a fuller linguistic context. Before making inferences on the nature of the syntactic and semantic processing underlying literal, metaphoric, and anomalous sentences as reflected by our ERP results, we need to determine the degree to which our endings differ in lexical parameters related to expectancy.

    FOLLOW-UP EXPERIMENT

    The present study was undertaken to determine if novel metaphors are processed as are literal sentences, with sense constructed from the onset, or as are anomalous sentences, with initial surprise at the metaphoric element, followed when possible by, or perhaps even triggering, construction of figurative meaning. To test this we put together a list of novel metaphors, some created and some taken from the literature on metaphor processing, and from them derived literal and anomalous sentences by changing the metaphoric element appropriately. We then rated the set of sentences for metaphoricalness and meaningfulness, yielding a stimulus set that could be read by subjects while we recorded their ERPs. ERP analyses revealed similar processing for anomalous and metaphoric sentences in an early window, but their divergence at N400, when the metaphoric sentences, like the literal sentences, showed a positive deflection.

    The literature on N400 suggests that a pronounced negative deflection for anomalous elements is caused by their incongruity or the resulting failure at constructing a sensible interpretation. To some extent both congruity and ease of construction of interpretation relate to predictability: The final word in “For breakfast I had bacon and socks” is not predictable given both probability in the language and the semantic interpretation constructed by the preceding context.

    Apart from anomaly, there are other variables that lead to low predictability of a word in context. For example, statistically, low frequency words should be less expected than high frequency words, and so if unpredictability underlies the N400 effect we would expect a greater effect for low than high frequency words. As a second example, consider the predictability of different words within a particular context, the cloze probability. “A stitch in time saves ten” is as sensible as “a stitch in time saves nine,” but is much less predicted by context. If the N400 effect arises from a simple discrepancy between the word expected and the word supplied, and if anomalous endings are less expected than literal or metaphoric ones, there could be a greater N400 effect for the anomalies independent of the ability to construct sense for them. Indeed Kutas, Lindamood, and Hillyard (1984) demonstrated that the size of the N400 varies inversely with cloze probability.

    Therefore, to follow up on our ERP results we further analyzed our stimuli, determining the expectedness of our ending elements as a function of word frequency and cloze probability.


     Methods

    Subjects.   For the cloze portion of the test a French–English native bilingual examined the stimuli of Pynte et al. (1996) and then created sentences using metaphors as well-known to English speakers as the French ones would have been to the French speakers in the Pynte et al. study. The English and English-translated French metaphors were informally evaluated by several of her bilingual friends and deemed equivalently “trite.”

    Twenty-four City College students, native or near-native English speakers (began speaking English before the age of 5, with all their schooling in English), were recruited primarily from psychology classes for the cloze experiment. They received course credit or were paid for their participation.

    Materials.   For the frequency analysis, the frequency of the final word for each semantic condition was computed from the Kučera and Francis (1967) word norms.

    For the cloze analysis, 40 “trite” metaphors were created so that (a) the metaphor would be familiar to English speakers, and (b) like the experimental sentences, the metaphor would hinge on only one, the final, word. Examples of these are “Despite her good grades, my roommate is a complete airhead”; “Wherever she goes, Donna spreads sunshine”; and “Please open a window; this room is a sauna.”

    The 40 new metaphors and the 80 original sentence frames were randomized together into two different random orders. The final word of the sentence was replaced by dashes, and, if a noun, the preceding article was rendered as “a(n).” The two random orders were printed into booklets.

    Procedure.   For the cloze experiment, subjects were tested for the most part in small groups. Each subject received a booklet, with about half the subjects tested in each order. The booklet opened with the written instructions (modified minimally from Bloom & Fischler, 1980):

    On the following pages are 120 sentences each with the final word left blank. Your task is simply to read each sentence at your normal rate, and write down the word that first occurs to you as a likely end of that sentence. For example, if the sentence “frame” were, “The party did not end until ____,” possible responses might include “dawn,” “three,” “late,” “midnight,” and so forth. Don’t try to be unique or average; just be natural. You should keep within the following bounds however: (1) only one response word per sentence; (2) the word should “make sense” of the sentence, and be from an appropriate class of words (nouns, verbs, adjectives, etc.); (3) English words only; (4) try to avoid repetitions. For some of the sentences, the response will seem obvious; for others, any number of words will seem possible. This is of course intentional since we are interested in the whole range of sentence constraints.

     Results

    To calculate cloze probability for each word, the number of times that word was provided by subjects for that sentence frame was calculated and then divided by 24 (the number of subjects). For each of the three endings for the 80 sentence frames used in the ERP experiment, both frequency and cloze probability are displayed in the Appendix, with the item.
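    The calculation amounts to a count divided by the number of subjects, as the following sketch shows. The function name and the response list are hypothetical; the responses were invented so that “grass” is given 18 times out of 24, which reproduces the .75 listed for the literal ending of sentence 6 in the Appendix. Lowercasing and trimming the responses is an added assumption, not a step described in the text.

```python
from collections import Counter


def cloze_probabilities(responses, n_subjects=24):
    """Proportion of subjects who supplied each word for one sentence frame."""
    counts = Counter(r.strip().lower() for r in responses if r.strip())
    return {word: count / n_subjects for word, count in counts.items()}


# Hypothetical responses to "The children playing in the park trampled
# the soft green ____": 18 of 24 subjects write "grass".
responses = ["grass"] * 18 + ["lawn"] * 4 + ["field"] * 2

probs = cloze_probabilities(responses)
print(probs["grass"])            # 0.75
print(probs.get("carpet", 0.0))  # 0.0: an ending no subject produced
```

    An ending that no subject produces, as was the case for most of the metaphoric and anomalous words, simply has a cloze probability of 0.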

    Word frequency.   Table 3 displays the average frequency for the 55 literal, metaphoric, and anomalous endings of the sentences analyzed in the ERP experiment. A one-way repeated-measures analysis of variance (the sentence frame yoked a particular literal, metaphoric, and anomalous word), corrected for sphericity, showed a significant effect of semantic condition [F(1, 54) = 6.31, p < .002]. Post hoc tests showed that the metaphoric endings were drawn from a less frequently occurring sample than were the literal or anomalous endings, which were not significantly different from one another.

    TABLE 3
    Mean Word Frequency (as Derived from the Kučera and Francis, 1967, Word Norms) for the 55 Literal, Metaphoric, and Anomalous Endings to the Sentences Analyzed in the ERP Experiment

    Literal    Metaphoric    Anomalous
    60.21      10.36         41.73

    Cloze probabilities.   Table 4 displays the mean cloze probabilities for the sentences used in the ERP study, along with the average cloze probability for the trite metaphors created for this study for comparison purposes. A repeated-measures analysis of variance like that conducted for frequency, comparing the three endings used in the ERP experiment, showed a significant effect of semantic condition [F(1, 54) = 37.54, p < .001], with post hoc tests revealing a higher cloze probability for the literal endings than for either of the other two, as well as a higher cloze probability for metaphoric endings than for anomalous ones. A t test for independent samples with unequal variance comparing the cloze probabilities of the test metaphors with those of the trite ones showed significantly less predictability of the test sentences [t(39) = 4.89, p < .005].

    TABLE 4
    Mean Cloze Probability (as Determined from 24 Subject Responses) for the 55 Literal, Metaphoric, and Anomalous Endings to the Sentences Analyzed in the ERP Experiment, as Well as to 40 Sentence Frames Designed to Have Trite Metaphoric Endings

    Literal    Metaphoric    Trite Metaphoric    Anomalous
    .19        .017          .227                .0005

     Discussion

    The purpose of the follow-up experiment was to determine the word variables that might have differentiated our stimuli, apart from their metaphoricalness or meaningfulness. Confirming our design, our metaphorical endings were shown to be significantly less predictable than a foil set of metaphors created to be comparable to the set of metaphors that Pynte et al. had used for French speakers. Thus, our ERP test, while replicating Pynte et al.’s general finding of a smaller (less positive) N400 for metaphoric than literal sentences, is actually a quite different result: (1) Our test used more creative, unfamiliar metaphors and (2) our test did not show a significant N400 effect, given the baseline of the early window.

    The follow-up results demonstrated that apart from meaningfulness and metaphoricalness, our endings differed in more mundane semantic measures: the metaphoric endings had a lower word frequency than did the other two ending types and a lower cloze probability than did the literal endings; the anomalous endings had a significantly smaller, near-zero cloze probability. In some sense, these results are quite reasonable given the way the stimuli were derived. First, one would expect a near-zero cloze probability for anomalous endings, since subjects in the cloze experiment were asked to “make sense,” and in selecting the anomalous ending, the experimenters had the opposite purpose. Likewise, since subjects in the cloze experiment were instructed not to try to be unique (or average), they would not be seeking a poetic creative-metaphor ending, in contrast to the selection process of the experimenters designing the stimuli. Second, in creating the anomalous and literal sentences the experimenters began with the metaphoric sentence, found in the literature or spontaneously occurring and therefore relatively unconstrained, and then thought of alternative endings. The endings that are most likely to come to mind are words that are more available or codable, closer to “the top of cognitive deck” (Brown, 1958, p. 236), which are more often high frequency than low frequency words.

    However, the finding of semantic variables, apart from meaningfulness and metaphoricalness, covarying with them in our study, mandates interpretation. It may be that the deviance for the literal sentences in the early window of the ERPs from those of the metaphoric and anomalous sentences reflects an ease of construction of interpretation and structure, determined not by a literal process per se, but by the availability of the words and their predictable fit to context, both of which may be part and parcel of normal, literal language interpretation. The greater N400 effect for the anomalous words as compared to the metaphoric and literal words may reflect the near-zero predictability for anomaly, the extreme deviance from expectedness, rather than, or in addition to, the difficulty in creating a sensible semantic interpretation (see also Kutas et al., 1984). It is more difficult to isolate an effect of word frequency per se on the ERP waveforms. The metaphoric words were of a lower frequency class, yet their waveforms grouped with those of the anomalous words in the early window, and with those of the literal words in the later window. So, if word frequency affected the waveforms, it did so in conjunction with some other variable(s).

    GENERAL DISCUSSION

    The present studies were designed to determine whether good novel metaphoric sentences are interpreted using constructive processes in common with literal sentences or whether their novel, literally anomalous meaning is noted first, perhaps triggering a special figurative process as suggested by the standard pragmatic theory. To this end we created, collected, had rated, weeded, and edited sets of sentences that could then be confidently considered as metaphoric, literal, or anomalous. We then measured event-related potentials (N400) believed to reflect recognition of anomaly, a presumably semantic process, for these sentences in new subjects. Those subjects were also given a recognition task to ensure that they were reading the sentences and a paraphrasing task to ensure that they interpreted them as we intended and as the results from the rating study suggested they would. Finally, we examined the frequency of occurrence in English of our final words and their cloze probabilities with respect to the sentence frames.

    Our ERP results suggest that the question of whether figurative language is perceived first as anomalous cannot be answered with a simple yes–no. The results clearly show a divergence in processing of anomalous sentences from literal and metaphoric sentences, a larger N400 effect for the anomalous sentences. In the region of N400, metaphoric sentences also differ from literal sentences, suggesting at first glance that they too may be perceived as more anomalous. However, the ERP results also showed a significant difference in the early epoch among the three sentence types, with literal and metaphoric sentences diverging in an early window encompassing N200, and thereafter following parallel trajectories. Thus, when the amplitudes in the early epoch were controlled for, there was no N400 effect for the metaphors.

    Friederici (1997) has identified the N200 window with preliminary sentence structuring: “an initial syntactic structure is assigned to the incoming information on the basis of word category information alone: during this stage, incoming words are structured into phrases (noun phrase, verb phrase), and grammatical roles (subject, object) are considered” (p. 64). Since in the current study, across semantic conditions, endings associated with the same frame shared part of speech, and yet we found a difference, we suggest that more than word category information and grammatical role figure into this initial structuring. One aspect that may figure in is word accessibility: Our metaphoric endings were significantly less frequently occurring according to the Kučera and Francis (1967) word norms than were our other endings. A second aspect that may come into play was suggested intriguingly many years ago by Chomsky (1965) in his Extended Standard Theory of syntax, a suggestion for a syntactic operation that goes beyond phrase structure assignment. In this theory, word category information is incorporated in “strict subcategorization rules” and word selection is governed by “selectional restrictions.” The latter ensured, for example, that if an animate word was required for the frame, one would be selected. If, as Chomsky proposed, selectional restrictions operate in the assignment of structure, a metaphoric sentence like “the camel is a desert taxi” would contain a violation, since “camel” would select for an animate object.

    Thus, it seems quite possible that a structure could be completely assigned in the early epoch only for the literal sentences, not for the others. While for each frame all three sentence types should be assigned the same phrase marker, for the literal sentences only would a preliminary check of selectional features yield a match and certainty. If Friederici is correct that processing in the region of the N200 reflects syntactic or structure processes, the separation we obtained for literal sentences from anomalous and metaphoric ones in this window suggests that early structure processes entail examination of selectional restrictions, resurrecting this aspect of the Extended Standard Theory of generative grammar.

    If the ERPs for literal sentences remained different from those for metaphor and anomaly we could conclude that the standard pragmatic theory for figurative language is correct: that a literal interpretation is first attempted, and only when rejected is a figurative interpretation tried. However, in fact, our ERP amplitudes for metaphoric and literal sentences were statistically equivalent in the region of the N400, showing similar relative changes from the early to the late window encompassing N400, the component identified with anomaly detection by a considerable body of research (for a review, see Kutas & Van Petten, 1994). In this later epoch, in Friederici’s (1997) framework, lexical-semantic processes are operating. Following this framework, our results suggest that after initial assignment of structure, including that based on selectional features, semantic processes, perhaps constructive, kick in to assign interpretations to both literal and metaphoric sentences. In contrast, for anomalous sentences, an interpretation cannot be assigned, resulting in a large N400 effect for them alone.

    As we discussed, ours is not the first study of N400 and metaphor. Pynte et al. (1996) measured N400s for common French metaphoric expressions. They reported a significant N400 effect for these, a result which ours appears to contradict. The contradiction is more remarkable given that their study used familiar metaphors and ours used novel ones, which one might expect would produce more surprise and therefore greater apparent anomaly. Indeed, metaphoric sentences in English which we created to be similarly trite to the French ones used by Pynte et al. showed a significantly greater cloze probability than did our “more original” metaphors. Our interest in metaphors and ERPs was to try to tap the constructive process in language understanding, a process that could be unnecessary for common, already interpreted, expressions. And in this regard, we must emphasize that (1) we used only sentences validated by independent subjects as metaphoric/literal and anomalous/meaningful, as appropriate to the categories “literal,” “metaphoric,” and “anomalous”; and (2) subjects from whom we recorded ERPs later paraphrased the sentences and for the most part interpreted them in accordance with our preassigned categories (91% of the sentences were correctly paraphrased by a half or more of the subjects).

    So why might Pynte et al. have found a significant N400 effect and we not for metaphoric sentences? First, we measured the ERP amplitude in the region of the N400 relative to the ERP amplitude in an earlier region, and used the difference as our measure of negativity; they compared the amplitudes of N400 across sentence types. We found a significant N400 effect only for anomalous sentences. They found that N400 was more negative for their metaphoric than their literal sentences, which is not to say that the effect was specific to this time window. Second, our study employed anomalous and literal sentences as controls and randomized all three sentence types in each block, so expectation of a particular sentence type would be neutralized. In their first experiment they included 50% literal sentences, 25% familiar metaphors, and 25% unfamiliar metaphors (these were endings for familiar metaphors matched with “wrong” frames and may in fact have been uninterpretable, that is, anomalous, in the minds of the subjects). For word pairs, it has been shown that list probability and context affect N400 (Brown, Hagoort, & Chwilla, 1996; Deacon, Breton, Ritter, & Vaughan, 1991; Holcomb, 1988). If this holds true also for sentences, subjects expecting literal sentences (because they were twice as likely as either of the others) could show an N400 effect for either of the others because they were relatively unexpected. Thus the N400 they obtained could be an artifact of their design, not a sign that metaphor is initially processed as anomaly. (Similar list probabilities that could affect the N400 in and of themselves exist in their other experiments.) Nevertheless, we concur with Pynte et al. that metaphoric sentences are not processed as are literal sentences, but we locate the difference in the earlier epoch, which they did not report. We disagree that metaphors are processed as are anomalies and have ERP measures for anomalous sentences in the same subjects who read the metaphors to substantiate this position.

    With regard to list probabilities affecting the size of the N400, it is important to note that our metaphors, while creative and interpretable (and in many cases used in previous studies in the literature on metaphor processing), differ from our anomalous and literal endings in predictability. The literal and anomalous endings were drawn from a sample of words occurring more frequently in the language, and the literal endings were provided in a cloze experiment as a filler for the frame more often than were the metaphoric words, while the anomalous ones were offered as endings to the frames significantly less often, indeed almost never. While frequency does not seem to have a clear impact by itself on the activity in either the early or the late windows, the cloze probabilities could be responsible for the relative amplitudes in the different semantic conditions in the later window, at N400: The lower the cloze probability the more negative the N400, a result consistent with the findings of Kutas et al. (1984). However, what produces a particular word selection for a sentence frame in the cloze task is a combination of syntactic and semantic fit. So, given the full sentence context, what we believe we see reflected in both the cloze probabilities of production and the ERPs of comprehension is the implementation by the late window of a constructive semantic interpretation and a parse for both the metaphors and the literal sentences, but not for the anomalies.

    In sum, we have demonstrated that metaphoric sentences are neither “fish nor fowl”: they are processed differently from literal sentences in the early structure-assigning epoch, and they are processed differently from anomalies in the region of the N400, when semantic interpretation may be assigned, and where they, and not anomalous sentences, are indeed interpretable. The results suggest that some “semantic” analysis may occur as part of structure assignment in line with Chomsky’s (1965) Extended Standard Theory, but that fuller semantic analysis does not read the metaphor as anomalous, in contradistinction to the “standard pragmatic theory.” The partially overlapping syntactic and semantic analyses indicating violation of selectional restrictions but interpretability may be what renders the tension that makes figurative language fun.

    APPENDIX

    Stimulus Sentences Validated by Subjects

    Asterisks indicate those used in the final ERP analyses. The metaphoric version is presented on the top line. Substitution for the last word of the word marked (a) yields the anomalous version and of the word marked (l), the literal version. Following each final word are two numbers. The first is the word’s frequency, and the second the cloze probability (see under Follow-Up Experiment).
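As an illustration of how the annotations described above can be read (a minimal sketch, not part of the original analyses), the following Python fragment extracts the three endings of one entry together with their frequency and cloze values. The function names `parse_entry` and `approx_respondents` are hypothetical. The conversion of a cloze probability into a head count assumes that the norms came from the 24 subjects of the follow-up cloze study, which is an inference from the reported values rather than something stated in the appendix.

```python
import re

# Assumption: cloze norms were collected from 24 subjects, so a reported
# probability p corresponds to roughly round(p * 24) respondents.
N_CLOZE_SUBJECTS = 24

# One "ending (frequency, cloze)" annotation, optionally tagged (a) or (l).
ANNOTATION = re.compile(
    r"(?P<text>[^\s()][^()]*?)\s*"
    r"\(\s*(?P<freq>\d+)\s*,\s*(?P<cloze>\d*\.?\d+)\s*\)"
    r"(?:\s*\((?P<tag>[al])\))?"
)

# An untagged annotation is the metaphoric ending on the top line.
CONDITIONS = {None: "metaphoric", "a": "anomalous", "l": "literal"}


def parse_entry(entry):
    """Return {condition: (final word, frequency, cloze)} for one entry."""
    endings = {}
    for match in ANNOTATION.finditer(entry):
        condition = CONDITIONS[match.group("tag")]
        # Keep only the last word before the parenthesis.
        final_word = match.group("text").split()[-1]
        endings[condition] = (
            final_word,
            int(match.group("freq")),
            float(match.group("cloze")),
        )
    return endings


def approx_respondents(cloze, n_subjects=N_CLOZE_SUBJECTS):
    """Approximate number of subjects who supplied the ending."""
    return round(cloze * n_subjects)


# Entry 12 from the appendix, written on a single line.
entry_12 = "*12. The camel is a desert taxi (16, 0) table (198, 0) (a) animal (68, .67) (l)"
for condition, (word, freq, cloze) in parse_entry(entry_12).items():
    print(condition, word, freq, cloze, approx_respondents(cloze))
```

The sketch keeps only the final word of each ending, so multiword endings such as "the Stones" or "hiding anger" would be truncated to their last word; a fuller parser would need the sentence frame to separate the ending from the frame.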

    1. The winter wind tossed the earth’s lacy blanket (30, 0)

    placemat (0, 0) (a) snow (59, .29) (l)

    2. The chimney belched forth soiled wisps of cotton (38, 0)

    wool (10, 0) (a) smoke (41, .58) (l)

    *3. The orchestra filled the concert hall with sunshine (8, 0)

    hail (10, 0) (a) music (216, .54) (l)

    *4. The skillful diplomat alleviated the internal tug-of-war (2, 0)

    hopskotch (1, 0) (a) crisis (82, .08) (l)

    *5. Spring makes green the woodland’s bare skeletons (1, 0)

    spheres (4, 0) (a) trees (0, .08) (l)

    *6. The children playing in the park trampled the soft green carpet (13, 0)

    bedspread (2, 0) (a) grass (53, .75) (l)

    *7. The hunter’s approach silenced the chattering underbrush (1, 0)

    moss (9, 0) (a) animals (58, .13) (l)

    *8. The flowers were watered by nature’s tears (34, .08)

    laughter (22, 0) (a) rain (70, .38) (l)

    9. The leaves were tossed by the earth’s gentle whisperings (1, 0)

    rollings (0, .04) (a) breeze (14, .58) (l)

    *10. His face was contorted by an angry cloud (28, 0)

    map (13, 0) (a) frown (1, .04) (l)

    11. The country’s border was marked by a concrete serpent (2, 0)

    elephant (7, 0) (a) wall (160, .58) (l)

    The previous 11 sentences were adapted from Gerrig and Healy (1983).

    *12. The camel is a desert taxi (16, 0)

    table (198, 0) (a) animal (68, .67) (l)

    *13. Not even Einstein’s ideas were all gold (52, 0)

    coal (32, 0) (a) great (665, .08) (l)

    *14. Some jobs are prisons (3, 0)

    houses (83, 0) (a) boring (5, .33) (l)

    The previous three sentences were adapted from Glucksberg (1989).

    *15. The bell sounded and the employees streamed from the anthill (0, 0)

    bedroom (52, 0) (a) factory (32, .29) (l)

    *16. The rush hour train stopped and out poured the sardines (2, .04)

    tuna (0, 0) (a) commuters (0, .21) (l)

    17. For the musical, the actress captured Broadway’s superbowl (0, 0)

    helmet (1, 0) (a) Tony (1, .08) (l)


    18. On his head he sported a rug (13, 0)

    chair (66, 0) (a) toupee (0, .08) (l)

    *19. The shampoo effectively removed the snowflakes (1, 0)

    ice (45, 0) (a) dandruff (0, .5) (l)

    *20. The teenager’s face was a coral reef (11, 0)

    sea (95, 0) (a) pock-marked (0, 0) (l)

    The previous sentence was adapted from Radencich and Baldwin (1985).

    *21. The avenger’s face was a sealed furnace (11, 0)

    shower (15, 0) (a) hiding anger (48, 0) (l)

    *22. He hates the slime that sticks on filthy deeds (8, 0)

    rules (0, 0) (a) toilets (4, 0) (l)

    *23. In the photograph he was doing a Napolean (7, 0)

    Lincoln (47, 0) (a) salute (3, 0) (l)

    The previous two sentences were adapted from Gibbs (1994).

    24. Sermons are like sleeping pills (0, .08)

    grapefruit (3, 0) (a) lectures (15, .29) (l)

    *25. Cigarettes are like timebombs (0, .04)

    furniture (39, 0) (a) cigars (2, .04) (l)

    The previous two sentences were adapted from Glucksberg and Keysar (1990).

    *26. We wait for the sun to blow out (33, 0)

    drip (1, 0) (a) set (414, .17) (l)

    27. The Beatles were more popular than Christ (97, .08)

    the navy (37, 0) (a) the Stones (12, .04) (l)

    28. The prison guard was a hard rock (75, 0)

    noodle (0, 0) (a) judge (77, 0) (l)

    The previous sentence was adapted from Winner, Rosenstiel, and Gardner (1976).

    *29. The worker exhausted his fuel (17, 0)

    chickens (13, 0) (a) energy (100, .08) (l)

    *30. For their daughter’s wedding the couple proposed a ceasefire (7, 0)

    highchair (0, 0) (a) toast (19, .42) (l)

    *31. The Mona Lisa was DaVinci’s Hamlet (7, 0)

    play (200, 0) (a) masterpiece (9, .54) (l)

    32. Federal funding for abortion is a minefield (0, 0)

    pasture (14, 0) (a) controversy (26, .08) (l)

    33. He recognized it as a great idea as soon as it erupted (7, 0)

    sank (18, 0) (a) appeared (135, .25) (l)

    *34. He sank into the featherbed, enjoying its soft embrace (13, .13)

    taste (59, 0) (a) feel (216, .17) (l)

    *35. Hoping to prevent a scene, she tried to lower his thermostat (6, 0)

    computer (13, 0) (a) rage (16, 0) (l)

    *36. The lifeguard sparkled with the healthy glow of tanned cancer (25, 0)

    headache (5, 0) (a) skin (47, .46) (l)

    37. The archeologists found the ancient dump to be an encyclopedia (1, 0)

    atlas (12, 0) (a) treasure-trove (0, .04) (l)

    *38. The buttery pastries melted all over his arteries (16, 0)

    kidney (6, 0) (a) chin (27, 0) (l)

    *39. Touching a turtle causes it to retreat into its armor (4, 0)

    nightgown (0, 0) (a) shell (22, .96) (l)


    *40. It was hard to see the road, the air was so soupy (0, 0)

    beefy (1, 0) (a) foggy (5, .33) (l)

    *41. When the car broke down she had to thumb (10, .04)

    pinky (1, 0) (a) hitchhike (0, 0) (l)

    42. From every corner cockroaches peeked like so many black plums (0, 0)

    apples (6, 0) (a) ants (7, .13) (l)

    *43. The cheap cushion seemed stuffed with old rocks (23, 0)

    shoes (44, 0) (a) clothes (89, 0) (l)

    *44. In the spring the brown branches are covered in tiny emeralds (6, 0)

    sapphires (0, 0) (a) leaves (49, .13) (l)

    45. He argued his test grade with the dragon (1, 0)

    horse (117, 0) (a) teacher (80, .33) (l)

    46. The rock star sweated hormones (2, 0)

    kidneys (5, 0) (a) profusely (3, .38) (l)

    *47. Mt. Everest is Nepal’s Yellowstone (0, 0)

    tree (59, 0) (a) attraction (15, .08) (l)

    48. Billboards are a highway’s warts (5, .08)

    dimples (0, 0) (a) eyesores (0, 0) (l)

    The previous sentence was adapted from Verbrugge and McCarrell (1977).

    49. A hat-trick in hockey—three goals—is a player’s grand slam (1, .46)

    heart (173, 0) (a) dream (64, 0) (l)

    50. Before the tornado the sky was a brilliant amethyst (0, 0)

    diamond (8, 0) (a) purple (13, 0) (l)

    51. On hot summer nights, the children played in the fire hydrant geysers (2, 0)

    mudbath (0, 0) (a) water (442, .29) (l)

    *52. Cave paintings are ancient graffiti (1, 0)

    music (216, 0) (a) art (208, 0) (l)

    *53. After their leafy feasts, caterpillars wrap in silk hammocks (0, 0)

    underwear (3, 0) (a) cocoons (0, .67) (l)

    *54. A night of heavy drinking makes your stomach a whirlpool (1, 0)

    lobster (1, 0) (a) upset (14, .04) (l)

    *55. Hypnosis opens memory’s dams (3, 0)

    trees (101, 0) (a) secrets (20, 0) (l)

    *56. The Christmas stocking was stuffed to the gills (0, 0)

    lungs (20, 0) (a) top (204, .21) (l)

    *57. Before gargling his breath smelled swampy (1, 0)

    red (197, 0) (a) horrible (15, .25) (l)

    58. The gym-teacher taught warm-up exercises as boot-camp (0, 0)

    algebra (2, 0) (a) required (182, 0) (l)

    59. There is only so much deposit I am willing to eat (61, 0)

    drink (82, 0) (a) forfeit (3, 0) (l)

    *60. Homing pigeons have a built-in compass (13, .04)

    ruler (3, 0) (a) locator (0, 0) (l)

    61. He jumped from the chair and evaporated (2, 0)

    disintegrated (0, 0) (a) left (480, .08) (l)

    *62. The multi-vehicle accident set off an alarm symphony (33, 0)

    violin (11, 0) (a) many alarms (1, 0) (l)

    *63. The spring wind created a pollen blizzard (7, .04)

    blackout (5, 0) (a) cloud (28, .04) (l)

    64. After the engine was replaced the car needed to be debugged (0, 0)

    baked (8, 0) (a) retuned (0, .33) (l)


    *65. She inhaled the sea perfume (10, 0)

    cat (23, 0) (a) breeze (14, .21) (l)

    *66. Rice is the Orient’s wheat (9, .04)

    clothes (89, 0) (a) grain (27, 0) (l)

    *67. Her favorite outfit was plagiarized (0, 0)

    watered (7, 0) (a) copied (3, 0) (l)

    *68. After a week of no rain the plants were panting (9, 0)

    typing (7, 0) (a) wilting (0, .13) (l)

    *69. The heavy smog created an emergency room flood (19, .04)

    snow (59, 0) (a) crowd (53, .04) (l)

    *70. Her wrinkled appearance was smoothed with a scalpel (0, 0)

    grape (3, 0) (a) facelift (0, .13) (l)

    *71. The teacher had trouble with the student’s hieroglyphics (0, 0)

    acorn (0, 0) (a) writing (117, .13) (l)

    *72. Samson was a biblical Hercules (3, 0)

    nymph (1, 0) (a) strongman (0, .04) (l)

    *73. Tuberculosis was the 19th century AIDS (27, .04)

    month (130, 0) (a) plague (6, .5) (l)

    *74. Alzheimer’s slowly destroys one’s hard-drive (0, 0)

    magazine (39, 0) (a) brain (45, .29) (l)

    *75. After five o’clock the financial district is a cemetery (15, 0)

    sky (58, 0) (a) deserted (15, .08) (l)

    *76. The therapist helped the patient reach shore (6, 0)

    leaves (49, 0) (a) peace (198, .04) (l)

    77. The theatre company produced a Shakespeare banquet (6, 0)

    floor (158, 0) (a) festival (27, .04) (l)

    78. His imagination could be seen with a microscope (8, .13)

    hammer (9, 0) (a) difficulty (76, 0) (l)

    *79. Nelson Mandela is South Africa’s Lincoln (47, 0)

    paper (157, 0) (a) savior (6, .21) (l)

    *80. Fluoridating water is dental penicillin (1, 0)

    cook (47, 0) (a) hygiene (3, .21) (l)

    REFERENCES

    American Electroencephalographic Society. (1990). Standard electrode position nomenclature. Bloomfield, CT.

    Bloom, P. A., & Fischler, I. (1980). Completion norms for 329 sentence contexts. Memory & Cognition, 8, 631–642.

    Brown, C. M., Hagoort, P., & Chwilla, D. J. (1996). An event-related brain potential analysis of visual word priming effects. In D. Chwilla (Ed.), Electrophysiology of word processing: The lexical processing nature of the priming effect. Self-published dissertation ISBN 90-9009317-6; printed and bound by Koninklijke Wohrmann b. v., Zutphen.

    Brown, R. (1958).  Words and things.  New York: The Free Press.

    Chomsky, N. (1965).  Aspects of the Theory of Syntax.  Cambridge, MA: MIT Press.

    Deacon, D., Breton, F., Ritter, W., & Vaughan, H. G., Jr. (1991). The relationship between N2 and N400: Scalp distribution, stimulus probability, and task relevance. Psychophysiology, 28, 185–200.

    Federmeier, K. D., & Kutas, M. (1999). A rose by any other name: Long-term memory structures and sentence processing. Journal of Memory and Language, 41, 469–495.

    Friederici, A. D. (1997). Neurophysiological aspects of language processing.  Clinical Neuroscience,  4, 64–72.

    Gibbs, R. W., Jr. (1994). The poetics of mind: Figurative thought, language, and understanding. New York: Cambridge Univ. Press.


    Gibbs, R. W., Jr., & Gerrig, R. J. (1989). How context makes metaphor comprehension seem ‘‘special.’’ Metaphor and Symbolic Activity, 4,  145–158.

    Glucksberg, S. (1989). Metaphors in conversation: How are they understood? Why are they used? Metaphor and Symbolic Activity, 4, 125–143.

    Glucksberg, S. (1991). Beyond literal meanings: The psychology of allusion. Psychological Science,  2, 146–152.

    Glucksberg, S., Gildea, P., & Bookin, H. A. (1982). On understanding speech: Can people ignore metaphors? Journal of Verbal Learning and Verbal Behavior, 21, 85–98.

    Glucksberg, S., & Keysar, B. (1990). Understanding metaphorical comparisons: Beyond similarity. Psychological Review, 97, 3–18.

    Holcomb, P. J. (1988). Automatic and attentional processing: An event-related brain potential analysis of semantic priming. Brain and Language, 35, 66–85.

    Johnson, M. (1980). A philosophical perspective on the problems of metaphor. In R. P. Honeck & R. R. Hoffman (Eds.), Cognition and figurative language (pp. 25–46). Hillsdale, NJ: Erlbaum.

    Katz, J. J. (1990).  The metaphysics of meaning.  Cambridge, MA: Bradford Books of MIT Press.

    Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown Univ. Press.

    Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.

    Kutas, M., & Iragui, V. (1998). The N400 in a semantic categorization task across 6 decades. Electroencephalography and Clinical Neurophysiology, 108, 456–471.

    Kutas, M., Lindamood, T. E., & Hillyard, S. A. (1984). Word expectancy and event-related potentials during sentence processing. In S. Kornblum & J. Requin (Eds.), Preparatory states and processing (pp. 217–237). Hillsdale, NJ: Erlbaum.

    Kutas, M., & Van Petten, C. K. (1994). Psycholinguistics electrified: Event-related brain potential investigations. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 83–143). San Diego, CA: Academic Press.

    McCarthy, G., & Wood, C. C. (1985). Scalp distribution of event-related potentials: An ambiguity associated with analysis of variance models. Electroencephalography and Clinical Neurophysiology, 62, 203–208.

    McElree, B., & Nordle, J. (1999). Literal and figurative interpretations are computed in equal time. Psychonomic Bulletin & Review, 6, 486–494.

    Ortony, A. (1980). Some psycholinguistic aspects of metaphor. In R. P. Honeck & R. R. Hoffman (Eds.), Cognition and figurative language (pp. 69–83). Hillsdale, NJ: Erlbaum.

    Pynte, J., Besson, M., Robichon, F.-H., & Poli, J. (1996). The time-course of metaphor comprehension: An event-related potential study. Brain and Language, 55, 293–316.

    Radencich, M. C., & Baldwin, R. S. (1985). Cultural and linguistic factors in metaphor interpretation. Bilingual Review,  12,  43–53.

    Verbrugge, R. R., & McCarrell, N. S. (1977). Metaphoric comprehension: Studies in reminding and remembering. Cognitive Psychology, 9, 494–533.

    Winner, E., Rosenstiel, A. K., & Gardner, H. (1976). The development of metaphoric understanding. Developmental Psychology,  12,  289–297.