Lower, Slower, Louder: Vocal Cues of Sarcasm


Patricia Rockwell¹

Previous studies have examined verbal rather than vocal aspects of irony. The present study considers how vocal features may cue listeners to one form of irony: sarcasm. Speakers were recorded reading sentences in three conditions (nonsarcasm, spontaneous sarcasm, posed sarcasm) with the resulting utterances filtered to remove verbal content. Listeners (n = 127) then rated these filtered utterances on amount of sarcasm. Results indicated that listeners were able to discriminate posed sarcasm from nonsarcasm but not spontaneous sarcasm from nonsarcasm. An analysis of the vocal features of the utterances as determined by perceptual coding indicated that a slower tempo, greater intensity, and a lower pitch level were significant indicators of sarcasm.


0090-6905/00/0900-0483$18.00/0 © 2000 Plenum Publishing Corporation

Journal of Psycholinguistic Research, Vol. 29, No. 5, 2000

¹ Department of Communication, University of Louisiana at Lafayette, Lafayette, Louisiana 70504.

Contemporary theories of verbal irony pay lip service to the importance of vocal cues in ironic utterances (e.g., “ironic tone of voice,” Clark & Gerrig, 1984, p. 122; “a tone of contempt,” Sperber, 1984, p. 135). None of these theories, however, has considered how vocal features may cue listeners to ironic content. Because of irony’s unique antithetical nature, vocal features may actually be more important than verbal features in aiding listeners in determining irony. In interactive communication, a common form of irony is sarcasm. The American Heritage Dictionary (1982) defines sarcasm as “a sharply mocking or contemptuously ironic remark intended to wound another” (p. 1091). This study tested the proposition that vocal features play a major role in indicating sarcasm and attempted to determine the nature of those vocal features.

KEY WORDS: sarcasm; irony; vocal cues; nonverbal behavior.


THE NATURE OF IRONY

Irony has been variously defined. The standard definition states that an utterance is ironic if the speaker’s intent is in opposition to the literal meaning (Preminger, 1974). Typically in an ironic utterance, the speaker’s intent is negative and the literal content is positive, as in “What a beautiful day!” uttered during a storm. (The opposite case is possible, but rare. For example, a speaker may say jokingly, “You sure blew that exam!” to a friend who just received a high score on a test.) Grice (1975) considered ironic utterances a flouting of his “quality maxim” in that speakers do not say what they mean. However, in irony, unlike other speech acts that violate the quality maxim, this flouting is intentional. The listener is ostensibly aware of the speaker’s intent, knows the speaker cannot be telling the truth, and thus assumes the speaker is proposing the opposite of the uttered remark (Brown & Levinson, 1987; Levinson, 1983; Sperber & Wilson, 1986). Therefore, irony researchers suggest that in order to perceive irony accurately, listeners must be aware of more than mere literal content.

For example, Winner and her colleagues (Winner, Levy, Kaplan, & Rosenblatt, 1988) set forth three steps listeners must complete in order to comprehend irony. The first is (1) detection of speaker intent. This step, Winner argues, is the most difficult. Once this step is passed, the remaining two steps, (2) detection of the relationship between what is said and what is meant, and (3) detection of the unstated meaning, are achieved easily (p. 54).

If perceiving irony requires a determination of unstated speaker intent, how do most listeners do it? The answer may lie in the nonverbal literature.

NONVERBAL CONNECTIONS

Possibly the reason irony researchers have been only marginally successful in determining the verbal features of irony is that irony represents a phenomenon that is more nonverbal than verbal in nature. Although most irony theorists admit that the primary purpose of irony is the indication of speaker intent or attitude, they contend that this attitude is best expressed by the verbal features of the utterance (Jorgensen, Miller, & Sperber, 1984; Kreuz & Glucksberg, 1989; Sperber, 1984). Most nonverbal researchers, however, have determined that attitudes are more clearly expressed nonverbally than verbally (Argyle, Alkema, & Gilmour, 1971; Mehrabian & Wiener, 1967). Therefore, if determination of speaker attitude is a prerequisite in perceiving irony, the best means of determining this attitude may be through nonverbal cues.

Another factor is that ironic intent is usually negative (Winner et al., 1988), as is the case with sarcasm. Nonverbal rather than verbal display of this negativity allows the speaker deniability and the opportunity to save face (Sperber & Wilson, 1986; Winner et al., 1988). Thus, it stands to reason that speakers who wish to convey a negative message, but do not wish to incur any negative consequences, may express this idea nonverbally.

There may also be physiological reasons for certain nonverbal characteristics of sarcasm (Buck, 1984). Nonverbal researchers indicate that positive arousal may lead to a typical positive facial expression, characterized by an upward pull on the facial muscles of the mouth and a relaxation of the vocal mechanism. On the other hand, negative arousal may result in a typical negative facial expression, characterized by a downward pull on the facial muscles and a tightening of the jaw and vocal mechanism. Thus, if sarcasm is prompted by negative arousal, there may be tangible physiological reasons why sarcasm has certain nonverbal components. Vocal sarcasm may also occur as a muscular response to a sarcastic facial expression. Facial expressions characteristic of sarcasm may include a sneer, rolling eyes, or deadpan expression. These facial expressions may “leak” into the vocal channel, as Bugental (1974) suggests, and cause the perception of sarcasm even when the listener cannot see the face of the speaker. Thus, speakers expressing a negative attitude facially may automatically be expressing it vocally too. (One can also argue that, contrarily, the sarcastic voice “leaks” into the facial expression.)

Other support for vocal sarcasm comes from developmental studies. Researchers (Ackerman, 1983; McDevitt & Carroll, 1988; Winner et al., 1988) have examined children’s and adults’ understanding of irony and have discovered significant differences between these age groups. Adults are better at detecting irony, researchers claim, because adults consistently rely on nonverbal rather than verbal cues when verbal and nonverbal channels conflict, as they do in irony and sarcasm (Krauss, Apple, Morency, Wenzel, & Winton, 1981; Sigelman & Davis, 1978); children, however, generally rely on verbal cues when verbal and nonverbal channels conflict (Winner et al., 1988) and thus are not as adept as adults are at detecting irony and sarcasm.

These various findings from the nonverbal domain suggest a possible connection between nonverbal cues and sarcasm. Thus, because sarcasm is primarily an indicator of speaker attitude and represents a contradiction of the verbal utterance, it may be argued that nonverbal cues provide an additional, and possibly a better, way of detecting sarcasm than verbal cues.

PROPOSED VOCAL FEATURES OF SARCASM

Because of the connection between speaker attitude and nonverbal behavior, research conducted on the vocal expression of emotions, similar to those typically conveyed in sarcasm, may offer some direction regarding the determination of the vocal features of sarcasm. Utilizing vocal characteristics of emotion (tempo, tempo variation, pitch level, pitch variation, intensity level, intensity variation, resonance, and articulation), researchers (Scherer, 1979; Scherer & Oshinsky, 1977) found that certain emotional states were cued most frequently by specific combinations of these vocal cues. For example, “pleasantness” was generally conveyed by a fast tempo, low pitch level, varied pitch, unvaried intensity, a resonant quality, and clipped articulation; whereas “disgust” was conveyed by a slow tempo, low pitch level, unvaried pitch, a shrill or blaring quality, and drawled or slurred articulation; “contempt” was found to be conveyed by increased intensity. In similar research, “contempt” was best expressed by varied intensity (Costanzo, Markel, & Costanzo, 1969) and “disgust” was conveyed by low pitch level (Ohala, 1983; Scherer, 1974). Unfortunately, findings regarding vocal correlates of emotion are often inconsistent across studies (Frick, 1985; Scherer & Oshinsky, 1977).

Because sarcasm generally involves a negative speaker attitude, one might argue that the vocal features associated with contempt and disgust may be similar to those of sarcasm. Likewise, one might expect nonsarcastic utterances to exhibit vocal features similar to utterances expressing pleasantness.

METHODOLOGICAL PROBLEMS OF SARCASM RESEARCH

Devising an appropriate methodology for testing vocal cues to sarcasm presents several problems. As vocal features of irony or sarcasm have not yet been tested, no models could be located that might indicate how listeners utilize vocal cues to perceive sarcasm.

Most irony research follows a typical pattern. Participants are tested by written means, usually a questionnaire that describes various situations in which the outcome may or may not be ironic. Obviously, vocal features cannot be presented in a written test; live or recorded voices must be used. Although a few studies have used oral stimuli, this has not generally been done out of choice, but out of necessity, because most of these studies have tested young children (Ackerman, 1983; McDevitt & Carroll, 1988).

Unfortunately, researchers who have presented ironic stimuli orally (Ackerman, 1983; McDevitt & Carroll, 1988; Winner et al., 1988) have shown little concern for realistic or precise presentation of vocal cues. For example, Winner et al. (1988) told stimulus speakers to produce vocal cues using either a “flat intonation” for ironic messages or “a sincere voice” for nonironic messages (p. 59). Ackerman (1983) had nonironic speakers read with “appropriate stressed intonation” and ironic speakers read with “unstressed and neutral intonation” (p. 489). To his credit, Ackerman ran a manipulation check of his stimulus utterances on a group of judges to determine if the vocal features he hoped to elicit were truly evident in the utterances. However, in most irony experiments that examine irony orally, utterances are encoded by only several speakers with little or no training and with little or no experimenter control. These examples represent the minimal extent to which most researchers have gone to present vocally realistic ironic utterances.

Another methodological problem in studying sarcastic vocal behavior involves the procedure for eliciting specific vocal characteristics from speakers. Some nonverbal researchers contend that if a vocal concept of interest, in this case sarcasm, is explained to speakers, it will bias speakers’ natural responses by forcing them to “act” their responses (Bugental, 1974). Therefore, some researchers argue that both a posed and a spontaneous condition should be used: one which elicits the target response from the speaker naturally and one in which the speaker is directly asked to produce the target response. This type of design has been tried with other nonverbal studies (Zuckerman, Hall, DeFrank, & Rosenthal, 1976; Zuckerman, DeFrank, Hall, Larrance, & Rosenthal, 1979). In these studies, researchers found that for moderately pleasant or unpleasant emotions, such as interest or distress, posed utterances were decoded more accurately; however, when extremely pleasant or unpleasant emotions, such as joy or anger, were expressed, spontaneous utterances were decoded more accurately. It is uncertain, however, whether sarcasm is prompted by moderate or extreme emotion.

If researchers can effectively elicit the vocal production of sarcasm, both spontaneously and posed, and if they can determine a valid methodology to test listeners’ perception of sarcasm, then an answer to one part of the irony puzzle may be at hand. It is possible that vocal behavior may represent a major cue in signaling sarcasm, making contextual verbal information less necessary. With this in mind, the following hypotheses and research question are presented:

H1: Listeners will be able to discriminate between nonsarcasm and sarcasm using only vocal features as cues.

RQ1: Will spontaneous and posed sarcasm be perceived as equally sarcastic by listeners using only vocal features as cues?

H2: Sarcastic utterances will exhibit slower tempos than nonsarcastic utterances.

H3: Sarcastic utterances will exhibit greater intensity than nonsarcastic utterances.

H4: Sarcastic utterances will exhibit greater intensity variation than nonsarcastic utterances.

H5: Sarcastic utterances will exhibit lower pitch levels than nonsarcastic utterances.

H6: Sarcastic utterances will exhibit less pitch variation than nonsarcastic utterances.

H7: Sarcastic utterances will exhibit less resonance than nonsarcastic utterances.

H8: Sarcastic utterances will exhibit less precise articulation than nonsarcastic utterances.

METHOD

Stimulus Speakers

Twelve speakers (males = 6, females = 6) were recruited to encode the utterances for the study. Six were professional radio announcers, two were faculty members with professional acting experience, and four were experienced members of the university speech team who had competed at numerous forensics tournaments. All speakers were volunteers. It was hoped that these participants would exhibit more expressive voices than student volunteers drawn from classes.

Procedure and Materials

Speakers reported individually to the university radio station where they were recorded in a sound-proof room using professional recording equipment. Intensity levels were maintained across speakers and all speakers were directed to speak at a precise distance from the microphone. Each speaker was asked to read three cards, presented one at a time, with each card containing a target utterance. Speakers were allowed time to study each card before reading it aloud. One card presented a situation designed to elicit a standard reading (nonsarcasm):

In the last year, Credible Computers had become a major player in the field of computer graphics. Much of the dramatic change was due to the efforts of John McKnight, a young man recently graduated from college. John had just won several major contracts for the firm and it looked as if everyone in the company would benefit from his expertise. Two company employees, Phil and Don, were discussing the company’s improved fortunes on their break.

“Things are sure looking good for us these days thanks to John!” said Don.
“Right!” responded Phil, “John sure is smart. What a guy!”
“He’s sure changed my life!” said Don.
The two continued discussing the company’s future for the rest of their break.


A second card presented a similar situation designed to elicit a sarcastic reading (spontaneous sarcasm):

In the last year, Credible Computers had been losing money and influence. Much of the dramatic change was due to the efforts of John McKnight, a young man recently graduated from college. John had just lost several major contracts for the firm due to his inept handling of clients, and it looked as if everyone in the firm would suffer from his mishandling of these clients. Two company employees, Phil and Don, were discussing the company’s dwindling fortunes on their break.

“Things are sure looking bad for us these days thanks to John!” said Don.
“Right!” responded Phil, “John sure is smart. What a guy!”
“He’s sure changed my life!” said Don.
The two continued discussing the company’s future for the rest of their break.

Card #3 contained only the target utterance, with directions to read the utterance “sarcastically” (posed sarcasm):

“John sure is smart. What a guy!”

The first two cards were alternated between speakers but the third card, which revealed the nature of the study, was always presented last to avoid bias. A total of three vignettes were used, but each speaker read cards which presented only one vignette. In addition to Vignette #1 indicated above, two other vignettes were used. Vignette #2 included the target utterance: “Look at the babies. They are so adorable!” Vignette #3 included the target utterance: “What a beautiful day. I’m glad to be outdoors!” All three target utterances were intentionally designed to consist of two sentences (to allow for sentence breaks), to be declarative rather than interrogative, to express the feelings or emotions of the speaking character in the vignette, to express an evaluation of a third person(s) or event, and to scrupulously avoid words or phrases that might be perceived as inherently sarcastic.

Stimulus Material

A master tape consisting of all target utterances (12 speakers × 3 cards = 36 utterances) was prepared with speakers and conditions presented in random order. In order to prevent the verbal content from influencing listeners, the utterances were content-filtered at a professional recording studio. Two independent monosignals were simultaneously recorded, which distorted the pitch level one octave above and one octave below the base signal. This produced a natural-sounding utterance, which was difficult, if not impossible, to understand. When the final version of the master tape was produced, each utterance was preceded by an announcement which gave the number of the forthcoming utterance (e.g., “Speaker Number One”) followed by a short pause. These announcements were not content-filtered so that listeners could keep track of the number of the utterance they were hearing.
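The random ordering of the 36 utterances on the master tape can be illustrated with a short sketch. This is only an illustration under stated assumptions: the article does not say how the random order was generated, and the function name `build_master_order`, the seed, and the speaker numbering 1 to 12 are all hypothetical.

```python
import random
from itertools import product

# Assumed labels: speakers numbered 1..12 and the three card conditions.
SPEAKERS = range(1, 13)
CONDITIONS = ("nonsarcasm", "spontaneous sarcasm", "posed sarcasm")


def build_master_order(seed=0):
    """Return all 36 (speaker, condition) pairs in a shuffled order.

    The seed is a hypothetical parameter that makes the shuffle repeatable.
    """
    rng = random.Random(seed)
    utterances = list(product(SPEAKERS, CONDITIONS))  # 12 x 3 = 36 pairs
    rng.shuffle(utterances)
    return utterances


order = build_master_order(seed=2000)

# The position on the tape plays the role of the announced utterance number.
for position, (speaker, condition) in enumerate(order[:3], start=1):
    print(f"Utterance {position}: speaker {speaker}, {condition}")
```

Every speaker-by-condition pair appears exactly once, so listeners hear each of the 36 recordings a single time, with condition unpredictable from position.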

Vocal Coding

Four communication majors (two seniors, two graduate students) coded the vocal features of the 36 target utterances. Coders received extensive training regarding the characteristics of the vocal features of interest. They practiced rating unrelated vocal samples until interrater reliabilities were deemed adequate. Coders worked separately, using an audio tape of all utterances. They completed a questionnaire for each utterance comprised of seven Likert-type bipolar scales (1 = low and 5 = high), which evaluated each utterance as follows [with accompanying interrater reliability coefficients as derived from the Spearman–Brown effective reliability formula (Rosenthal, 1985)]: slow tempo/fast tempo (.96), low pitch/high pitch (.95), unvaried pitch/varied pitch (.62), low intensity/high intensity (.84), unvaried intensity/varied intensity (.75), nonresonant/resonant (.90), and slurred/articulate (.75).
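For readers unfamiliar with it, the Spearman–Brown effective reliability of the mean of n judges is commonly written R = n·r / (1 + (n − 1)·r), where r is the mean correlation between pairs of judges. The sketch below implements that standard formula and its inverse. The function names are hypothetical, and the article reports only the effective coefficients, not the underlying pairwise correlations, so any r recovered this way is an inference from the formula rather than a reported value.

```python
def spearman_brown(mean_r, n_judges):
    """Effective reliability of the averaged ratings of n_judges coders."""
    if n_judges < 1:
        raise ValueError("n_judges must be at least 1")
    return n_judges * mean_r / (1 + (n_judges - 1) * mean_r)


def mean_r_from_effective(effective_r, n_judges):
    """Invert the formula: mean pairwise correlation implied by a coefficient."""
    return effective_r / (n_judges - (n_judges - 1) * effective_r)


# With four coders, a mean pairwise correlation of .50 gives an
# effective reliability of .80.
print(round(spearman_brown(0.50, 4), 2))

# Implied mean pairwise correlations for two of the reported coefficients,
# assuming all four coders entered the formula.
for scale, coefficient in (("tempo", 0.96), ("pitch variation", 0.62)):
    print(scale, round(mean_r_from_effective(coefficient, 4), 2))
```

Under that assumption, the tempo coefficient of .96 would correspond to a mean pairwise correlation of roughly .86, and the pitch variation coefficient of .62 to roughly .29, which shows how much the formula rewards pooling several coders.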

Participants

Participants (n = 127) consisted of students in six undergraduate communication classes who responded to the master audiotape during a class period. Students were told that the study was voluntary and anyone who did not wish to participate was allowed to leave. To increase motivation, it was announced that the individual who accurately identified the most utterances would receive a $50.00 prize. This prize was awarded at the conclusion of data collection.

Procedure

Participants were informed of the nature of the study. Sarcasm was defined and examples given. Participants were told that often when speakers are sarcastic, their voices indicate this. Participants were told that they were being asked to determine how sarcastic a particular speaker was by the sound of the voice. The master tape of the 36 target utterances was then played for the participants. Participants completed the sarcasm questionnaire while listening to the tape. When necessary, the tape was paused so that participants would have sufficient time to complete their ratings. The questionnaire asked participants to rate the amount of sarcasm for each of the 36 utterances on a 5-point Likert-type scale (1 = “speaker is not at all sarcastic,” to 5 = “speaker is very sarcastic”).


RESULTS

Initially, analysis of variance tests were conducted with sarcasm (nonsarcasm, spontaneous sarcasm, and posed sarcasm) serving as a repeated measure, and vignette (Vignette No. 1, Vignette No. 2, and Vignette No. 3) and order (presented first, presented second, and presented third) serving as between-subjects measures in a 3 × 3 × 3 design. The sarcasm scores derived from the participants’ responses to the utterances (127 participants × 36 utterances = 4572 responses) served as the dependent measure. A significant main effect for sarcasm condition, F (2,4549) = 41.73, p < .001, was found. Planned comparisons between the three sarcasm conditions were then conducted, yielding a significant difference between the nonsarcasm condition and the posed sarcasm condition, t (1,4549) = −3.55, p < .001, but not between the nonsarcasm condition and the spontaneous sarcasm condition, t (1,4549) = −1.13, p < .26. An examination of the means indicated that posed sarcasm was perceived as more sarcastic than nonsarcasm (M of nonsarcasm = 2.96, M of spontaneous sarcasm = 2.91, and M of posed sarcasm = 3.34). However, spontaneous sarcasm was perceived as somewhat less sarcastic than nonsarcasm. These results provide only partial support for H1, which posited that listeners would be able to discriminate between sarcasm and nonsarcasm, but provide a negative answer to RQ1, which asked if listeners would perceive spontaneous sarcasm and posed sarcasm as equally sarcastic.

An additional examination of the F scores indicated that the order condition was also significantly different among the three levels of sarcasm, F (2,4549) = 11.77, p < .001. Comparisons of the means for order indicated that the third condition (presented third) was significantly different from both the first condition, t (1,4549) = −7.36, p < .001, and from the second condition, t (1,4549) = 8.34, p < .001. This outcome was to be expected in that the posed condition was intentionally always presented third (last) to prevent speaker bias while enacting the other conditions. In the more important comparison between the first- and second-order conditions, no significant differences were found between the means, t (1,4549) = .98, p < .33. This result indicates that the order of presentation of the first two cards did not significantly alter the speakers’ encoding of these two conditions.

No significant differences were found for vignette, F (2,4549) = .469, p < .63. This finding suggests that the three vignettes were perceived as relatively equal in amount of sarcasm across conditions.

Analysis of variance tests were then conducted using sarcasm condition (nonsarcasm, spontaneous sarcasm, and posed sarcasm) as the independent variable and the means of the seven coded vocal variables for all utterances (n = 36) as the dependent measures, with planned comparisons conducted on those variables found to be significant (Table I). From these tests it was determined that tempo, F (2,33) = 3.58, p < .04, was significantly different in the three conditions. Planned comparisons of the means of the three levels of this variable, conducted to answer H2, indicated that tempo was significantly different between the nonsarcastic condition and the combined spontaneous and posed sarcasm condition, t (1,33) = 12.60, p < .001, but not between the spontaneous and the posed conditions, t (1,33) = 1.27, p < .18.

Tests on the intensity variables produced significant results for intensity level, F (2,33) = 3.72, p < .04, but not for intensity variation, F (2,33) = .07, p < .93. Planned comparisons on intensity level designed to answer H3 were conducted and yielded a significant difference between the combined sarcasm condition and the nonsarcasm condition, t (1,33) = 12.25, p < .001, and between the spontaneous and the posed sarcasm conditions, t (1,33) = −2.05, p < .05.

Analysis of the pitch variables produced significant results for pitch level, F (2,33) = 3.96, p < .03, but not for pitch variation, F (2,33) = .10, p < .90. Planned comparisons on pitch level designed to answer H5 were conducted and yielded a significant difference between the nonsarcasm condition and the combined sarcasm condition, t (1,33) = 10.90, p < .001. No significant differences were found between the spontaneous and the posed conditions, t (1,33) = 1.47, p < .15.

Tests on resonance did not produce significant results, F (2,33) = .49, p < .61. Tests on articulation, F (2,33) = .39, p < .67, also produced no significant results.
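The structure of these tests (a one-way analysis of variance across three conditions of 12 utterances each, followed by planned comparisons) can be sketched as follows. This is a generic textbook computation, not a reconstruction of the article's analysis: the scores are invented toy values on the 1 to 5 coding scale, the contrast weights (2, −1, −1) and (0, 1, −1) are an assumption about how "nonsarcasm versus combined sarcasm" and "spontaneous versus posed" might be coded, and the function names are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a list of score lists."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    ms_within = ss_within / df_within
    f_ratio = (ss_between / df_between) / ms_within
    return f_ratio, df_between, df_within, ms_within


def contrast_t(groups, weights, ms_within):
    """t statistic for a planned contrast, using the pooled within-group error."""
    estimate = sum(w * mean(g) for w, g in zip(weights, groups))
    variance = ms_within * sum(w ** 2 / len(g) for w, g in zip(weights, groups))
    return estimate / variance ** 0.5


# Invented toy scores (NOT the study's data): 12 utterances per condition.
toy_nonsarcasm = [3.5, 4.0, 3.25, 3.75, 4.25, 3.0, 3.5, 4.0, 3.75, 3.25, 4.0, 3.5]
toy_spontaneous = [3.25, 3.75, 3.0, 4.0, 2.5, 3.5, 3.75, 3.25, 4.25, 2.75, 3.5, 3.5]
toy_posed = [2.5, 3.0, 2.0, 3.75, 2.25, 1.75, 3.5, 2.75, 4.0, 2.0, 3.0, 2.5]
toy_groups = [toy_nonsarcasm, toy_spontaneous, toy_posed]

f_ratio, df_b, df_w, ms_w = one_way_anova(toy_groups)
print(f"F({df_b},{df_w}) = {f_ratio:.2f}")

# Assumed weights: nonsarcasm vs. combined sarcasm, then spontaneous vs. posed.
print(f"t = {contrast_t(toy_groups, (2, -1, -1), ms_w):.2f}")
print(f"t = {contrast_t(toy_groups, (0, 1, -1), ms_w):.2f}")
```

With 36 utterances in three groups, the degrees of freedom come out as (2, 33), matching the form of the F tests reported above. The two contrasts are orthogonal when group sizes are equal, which is one reason this pairing is a natural choice for planned comparisons.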

DISCUSSION

This study shows that listeners can decode posed sarcasm from vocal cues alone. As researchers had previously found that posed emotions were decoded more accurately than spontaneously produced emotions when the initiating emotion was moderate, the finding in the present study of higher sarcasm scores for posed sarcasm than for nonsarcasm seems to imply that sarcasm may be prompted by moderate rather than intense emotion. For example, if an individual experiences a negative emotion such as disliking someone, and the dislike is only moderate, the individual might express this dislike using sarcasm; if, however, the dislike for the individual is much greater, this might be expressed in a form much stronger than sarcasm, such as a direct insult or an expletive-filled outburst. It is possible that spontaneous sarcasm may convey such a moderate emotion, making spontaneous sarcasm undetectable from nonsarcasm in vocal presentation.

The vocal expression of sarcasm appears to exhibit a clear pattern of vocal cues. The results from the present study indicate that sarcasm is conveyed by a slower tempo, lower pitch level, and greater intensity than nonsarcasm. It was surprising that no effects were found in this study for resonance or articulation because these variables were exhibited in other studies for the emotion of disgust. Possibly sarcasm is not sufficiently intense to cause changes in these two variables. Also, it is possible that the experienced speakers used in this study exhibited consistently high levels of resonance and articulation due to their background and training, which average speakers might not have been able to do.

As no differences were found between the spontaneous and posed conditions for either tempo or pitch, one can infer that tempo and pitch levels are similar for both types of sarcasm. Intensity level, however, produced significant differences between spontaneous and posed sarcasm, with the posed version producing a significantly louder utterance than the spontaneous version. Intensity has been a difficult variable to pin down because of various technological problems associated with its measurement. In the present study, extreme care was taken to ensure that intensity was measured consistently across subjects and conditions, which suggests that the findings represent a valid depiction of differences in intensity that occur between various degrees of sarcasm.

It is interesting to note that the three vocal variables found as cues to sarcasm in this study all involved an overall change of level, not variations over the course of an utterance, as had been found in some emotion studies. Just what this implies is unclear. Possibly, an overall change in level of a vocal feature is more likely to occur when the inducing emotion is moderate rather than strong. That is, it may require a greater amount of emotion to cause a speaker to vary a vocal feature within an utterance than it does to cause the speaker to increase or decrease the overall level of the feature. As vocal variation within an utterance implies a certain amount of control, it seems likely that a speaker experiencing a strong emotion will be more motivated to make vocal changes that will reflect the strength of the experienced emotion than will a speaker experiencing moderate emotion.

As verbal features played no part in this study, the findings must temper those of previous irony studies, which have argued that contextual verbal information is required for the perception of irony. This verbal information simply does not appear to be as necessary, at least for posed sarcasm, which is conveyed by a voice that is lower, slower, and louder than nonsarcasm, and listeners seem to get the message.


Table I. Means of Vocal Features of Sarcastic Utterances

                                           Condition
                           ---------------------------------------
                                        Spontaneous   Posed
Vocal feature              Nonsarcasm   sarcasm       sarcasm       F      p

Tempo                 M    3.63         3.42          2.75          3.58   .04a
                      SD    .53          .70           .89
Intensity             M    3.21         3.29          3.88          3.72   .04a
                      SD    .54          .75           .64
Intensity variation   M    2.78         2.19          2.28           .07   .93
                      SD    .63          .70           .57
Pitch                 M    3.08         2.92          2.88          3.96   .03a
                      SD    .82          .67           .71
Pitch variation       M    2.92         2.21          2.25           .10   .90
                      SD    .75          .75           .75
Resonance             M    2.88         2.75          3.00           .49   .61
                      SD    .74          .72           .88
Articulation          M    2.67         2.42          2.38           .39   .67
                      SD   1.05          .84           .68

a p significant at .05 level.

REFERENCES

Ackerman, B. P. (1983). Form and function in children’s understanding of ironic utterances. Journal of Experimental Child Psychology, 35, 487–508.

Argyle, M., Alkema, F., & Gilmour, R. (1971). The communication of friendly and hostile attitudes by verbal and nonverbal signals. European Journal of Social Psychology, 1, 385–402.

American heritage dictionary (2nd edn.). (1991). Boston, Massachusetts: Houghton Mifflin.

Brown, P., & Levinson, S. C. (1987). Politeness: Some universals in language usage. Cambridge, England: Cambridge University Press.

Buck, R. (1984). The communication of emotion. New York: Guilford Press.

Bugental, D. E. (1974). Interpretation of naturally occurring discrepancies between words and intonation: Modes of inconsistency resolution. Journal of Personality and Social Psychology, 30, 125–133.

Clark, H. H., & Gerrig, R. J. (1984). On the pretense theory of irony. Journal of Experimental Psychology, 113, 121–126.

Costanzo, F. S., Markel, N. N., & Costanzo, R. R. (1969). Voice quality profile and perceived emotion. Journal of Counseling Psychology, 16, 267–270.

Frick, R. W. (1985). Communicating emotion: The role of prosodic features. Psychological Bulletin, 97, 412–429.

Grice, H. P. (1975). Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics: Vol. 3. Speech acts (pp. 41–58). New York: Academic Press.

Jorgensen, J., Miller, G. A., & Sperber, D. (1984). Test of the mention theory of irony. Journal of Experimental Psychology, 113, 112–120.

Krauss, R. M., Apple, W., Morency, N., Wenzel, C., & Winton, W. (1981). Verbal, vocal, and visible factors in judgments of another’s affect. Journal of Personality and Social Psychology, 40, 312–319.

Kreuz, R. J., & Glucksberg, S. (1989). How to be sarcastic: The echoic reminder theory of verbal irony. Journal of Experimental Psychology, 118, 374–386.

Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.

McDevitt, T. M., & Carroll, M. (1988). Are you trying to trick me? Some social influences on children’s responses to problematic messages. Merrill-Palmer Quarterly, 34, 131–145.

Mehrabian, A., & Wiener, M. (1967). Decoding of inconsistent communications. Journal of Personality and Social Psychology, 6, 108–114.

Ohala, J. J. (1983). Cross-language use of pitch: An ethological view. Phonetica, 40, 1–18.

Preminger, A. (1974). (Ed.) Princeton encyclopedia of poetry and poetics. London: Macmillan.

Rosenthal, R. (1985). Conducting judgment studies. In K. R. Scherer and P. Ekman (Eds.), Handbook of methods in nonverbal behavior research (pp. 287–361). Cambridge: Cambridge University Press.

Scherer, K. R. (1974). Acoustic concomitants of emotional dimensions: Judging affect from synthesized tone sequences. In S. Weitz (Ed.), Non-verbal communication (pp. 105–111). New York: Oxford University Press.

Scherer, K. R. (1979). Nonlinguistic vocal indicators of emotion and psychopathology. In C. E. Izard (Ed.), Emotions in personality and psychopathology (pp. 495–529). New York: Plenum.

Scherer, K. R., & Oshinsky, J. S. (1977). Cue utilization in emotion attribution from auditory stimuli. Motivation and Emotion, 1, 331–346.

Sigelman, C. K., & Davis, P. J. (1978). Making good impressions in job interviews: Verbal and nonverbal predictors. Education and Training of the Mentally Retarded, 13, 71–77.

Sperber, D. (1984). Verbal irony: Pretense or echoic mention? Journal of Experimental Psychology, 113, 130–136.

Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Cambridge, Massachusetts: Harvard University Press.

Winner, E., Levy, J., Kaplan, J., & Rosenblatt, E. (1988). Children’s understanding of nonliteral language. Journal of Aesthetic Education, 22, 51–63.

Zuckerman, M., DeFrank, R. S., Hall, J. A., Larrance, D. T., & Rosenthal, R. (1979). Facial and vocal cues of deception and honesty. Journal of Experimental Social Psychology, 15, 378–396.

Zuckerman, M., Hall, J. A., DeFrank, R. S., & Rosenthal, R. (1976). Encoding and decoding of spontaneous and posed facial expressions. Journal of Personality and Social Psychology, 34, 966–977.