HOWLING WOLVES AND ROARING LIONS: WHAT SPEAKERS THINK AND WHAT A CORPUS TELLS US
John Newman & Tamara Sorenson Duncan
Department of Linguistics
University of Alberta
CSDL-12 Conference, Santa Barbara, 4-6 November 2014
Alternative empirical approaches to understanding linguistic phenomena
• Different kinds of data, different kinds of analysis
  • e.g., spoken corpus data & collostructional analysis (attraction of a word to a construction); sentence completion task & response time to recognition of a word in a construction
• Different kinds of data, same kind of analysis
  • e.g., spoken corpus data, elicitation data (sentence completion task), behavioral profile analysis
• Same data, different analyses
  • e.g., one data frame analyzed by two or more different statistical techniques
Cf. Gilquin, Gaëtanelle & Stefan Th. Gries. 2009. Corpora and experimental methods: A state-of-the-art review. Corpus Linguistics and Literary Theory 5(1): 1-26.
Possible outcomes of different approaches to studying linguistic phenomena
• Convergence of outcomes
• Divergence of outcomes
• Not clear (some results converge, some diverge)
This study: syntactic subject preference of ROAR and HOWL
• sentence elicitation task
• adult corpora & measure association strength of subject nouns associated with verbs
• adult corpora & behavioural profile analysis of factors for verbs
Elicitation Task
• “Isolated sentence production”: 31 research participants saw a word appear on the screen and were asked to provide a sentence using that word. The target words included ROAR and HOWL.
• 10 target verbs x 3 = 30 target stimuli
• 20 distractors x 2 = 40 distractor stimuli
• Only instances where the target word was used as a verb in an active construction were included in the analysis: 594 sentences in total (range 36-81 sentences/word).
Isolated sentence production vs connected discourse
• less use of passive
• I/we more likely as subjects of sentences
• Tom1 noticed that he1.. anaphora more than Tom noticed that Bob…
Roland, Douglas & Daniel Jurafsky. 2002. Verb sense and verb subcategorization probabilities. In Paola Merlo & Suzanne Stevenson (eds.), The Lexical Basis of Sentence Processing: Formal, Computational, and Experimental Issues, 325-346. Amsterdam: John Benjamins.
Subjects of ROAR in Elicited Sentences
LION 12
CROWD 4
HE 3
CAT 2
CHILD 2
I 2
MONSTER 2
SIMBA 2
(YOU) 1
ANIMAL 1
DOG 1
DRAGON 1
IT 1
JET 1
LADY 1
NATALIE 1
PRINCIPAL 1
WAVE 1
WE 1
YOU 1
Subjects of HOWL in Elicited Sentences
WOLF 14
DOG 11
HE 6
I 6
COYOTE 4
WIND 4
WEREWOLF 3
(YOU) 2
YOU 2
ANIMAL 1
BEAR 1
MAN 1
PEOPLE 1
Non-subject references to [animal] nouns in Elicited Sentences (not included in count of subject types)
• i became a wolf and howled at the moon. AWOOOOOOOOOOOOOHH
• Can more animals roar than just a lion
• Natalie pretended to roar like an animal
• I'm a lion, hear me roar.
• The child likes to roar and pretend he's a lion.
Word associations with ROAR as prompt (Edinburgh Association Thesaurus)
Response   No. of responses   Proportion of total responses
LION       48                 0.49
NOISE      5                  0.05
BULL       3                  0.03
CROWD      3                  0.03
LOUD       3                  0.03
SHOUT      3                  0.03
BELLOW     2                  0.02
CAR        2                  0.02
RAGE       2                  0.02
TIGER      2                  0.02
WATER      2                  0.02
Showing items for responses >1, total no. of responses = 98
Word associations with HOWL as prompt (Edinburgh Association Thesaurus)
Response   No. of responses   Proportion of total responses
WOLF       18                 0.18
DOG        13                 0.13
CRY        11                 0.11
SHOUT      7                  0.07
YELL       7                  0.07
OWL        5                  0.05
SCREAM     5                  0.05
NOISE      3                  0.03
GROAN      2                  0.02
LAUGH      2                  0.02
LAUGHTER   2                  0.02
NIGHT      2                  0.02
SHRIEK     2                  0.02
Showing items for responses >1, total no. of responses = 100
Learning about roaring lions
• Amazon CHILDREN’S BOOKS category search on “lion+roar”
Baby-2 year old:
Lion Cub Roars - (Baby Animals Book)
I Can Roar Like a Lion
3-5 year old:
When Lions Roar
Googly Eyes: Leo Lion's Noisy Roar!
Simon Says Roar like a Lion
The Happy Lion Roars
The Lion Who Couldn't Roar
Can You Roar Like a Lion? (Pull-N-Slide Books)
etc.
Learning about roaring lions
Pop culture
Who (or what) roars and howls in COCA?
proxy subject of a ROAR verb
COCA subjects of ROAR by frequency
SPOKEN Freq
[HURRICANE] 16
[ECONOMY] 11
[KATRINA] 11
[FIRE] 8
[FLAME] 7
[CROWD] 7
[MOUSE] 6
[MARKET] 6
[LION] 6
[WIND] 6
[WALL] 5
[TRAIN] 5
[STOCK] 5
WRITTEN Freq
[CROWD] 177
[ENGINE] 155
[CAR] 112
[FIRE] 76
[WIND] 67
[TRUCK] 45
[LION] 45
[AUDIENCE] 40
[TRAIN] 39
[PLANE] 33
[HEAD] 33
ALL Freq
[CROWD] 183
[ENGINE] 158
[CAR] 116
[FIRE] 84
[WIND] 71
[LION] 51
[TRUCK] 46
[TRAIN] 43
[AUDIENCE] 40
[FLAME] 36
[PLANE] 35
[HEAD] 33
[HURRICANE] 33
FICTION Freq
[ENGINE] 122
[CAR] 98
[CROWD] 75
[FIRE] 49
[WIND] 40
[TRUCK] 33
[HEAD] 27
[TRAIN] 25
[MAN] 21
[MOTORCYCLE] 21
[FLAME] 21
MAGAZINE Freq
[CROWD] 47
[WIND] 23
[LION] 19
[ENGINE] 17
[FIRE] 14
[AUDIENCE] 13
[ECONOMY] 11
[CAR] 8
[MARKET] 8
[TORNADO] 7
[TRAIN] 7
[TRUCK] 7
NEWSPAPER Freq
[CROWD] 47
[ENGINE] 13
[AUDIENCE] 11
[FIRE] 10
[PLANE] 10
[HURRICANE] 9
[JET] 9
[MARKET] 9
[MOTORCYCLE] 7
[STOCK] 7
[HELICOPTER] 7
[FAN] 7
ACADEMIC Freq
[CROWD] 8
[LION] 5
Collostructional Analysis
• Use Coll.analysis 3.2a script by Stefan Gries
• Total no. of constructions in corpus = [vv*] in corpus
• Total no. of [SUBJ ROAR] constructions = [n*] in L3-L1 of [vv*], grouped as lemmas (sg + pl)
• coll.strength: -log10 (Fisher-Yates exact, one-tailed), the higher, the stronger
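The coll.strength definition above can be illustrated with a minimal Python sketch. This is not the Coll.analysis script itself: the function names and the counts in the usage line are hypothetical, and the sketch covers only the attraction direction (observing at least as many co-occurrences as were found), which is how the one-tailed test is assumed to be applied here.

```python
from math import exp, lgamma, log


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def coll_strength(noun_in_cxn, cxn_total, noun_total, corpus_total):
    """-log10 of a one-tailed Fisher exact p-value (attraction direction).

    noun_in_cxn  : tokens of the noun as proxy subject of the verb
    cxn_total    : all tokens of the [SUBJ VERB] construction
    noun_total   : all tokens of the noun in the corpus
    corpus_total : all constructions in the corpus (e.g. all lexical verbs)
    """
    upper = min(cxn_total, noun_total)
    # Smallest count that is possible given the table margins.
    lower = max(noun_in_cxn, cxn_total + noun_total - corpus_total)
    log_denominator = log_comb(corpus_total, cxn_total)
    # Hypergeometric log-probabilities for every count >= the observed one.
    log_terms = [
        log_comb(noun_total, k)
        + log_comb(corpus_total - noun_total, cxn_total - k)
        - log_denominator
        for k in range(lower, upper + 1)
    ]
    # Sum in log space so that very small p-values do not underflow to 0.
    peak = max(log_terms)
    log_p = peak + log(sum(exp(t - peak) for t in log_terms))
    return -log_p / log(10)


# Hypothetical counts, for illustration only (not taken from COCA).
print(round(coll_strength(50, 2_000, 10_000, 40_000_000), 1))
```

Working in log space matters for scores as large as those reported for CROWD or ENGINE, where the p-value itself is far too small to represent as an ordinary floating-point number.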
COCA subjects of ROAR by Coll.analysis
SPOKEN Coll. Score
[HURRICANE] 25.2
[KATRINA] 19.7
[FLAME] 12.9
[LION] 11.3
[MOUSE] 11.3
[ECONOMY] 8.2
[CROWD] 6.9
[FIRE] 6
[STOCK] 4.7
[MARKET] 4.2
[WALL] 4.1
WRITTEN Coll. Score
[CROWD] 217.3
[ENGINE] 203.2
[CAR] 58.5
[LION] 53.1
[WIND] 46.1
[MOTORCYCLE] 43.6
[FIRE] 42.8
[TRUCK] 32.4
[HELICOPTER] 29.7
[FLAME] 29.4
[BEAST] 28.2
[AUDIENCE] 27.2
[PLANE] 21.7
[TRAIN] 20.7
[KONG] 20.6
[THUNDER] 16.2
[JET] 15.8
[HURRICANE] 15.4
[TORNADO] 14.7
[TIGER] 14.4
[MOTOR] 14.2
ALL Coll. Score
[CROWD] 226.4
[ENGINE] 213.3
[LION] 63.8
[CAR] 60.7
[WIND] 50.9
[MOTORCYCLE] 47.7
[FIRE] 47.7
[FLAME] 39.9
[TRUCK] 33.4
[HURRICANE] 33.3
[HELICOPTER] 29.8
[BEAST] 29.1
[AUDIENCE] 24.6
[TRAIN] 24
[PLANE] 21.2
[KONG] 20.9
[JET] 20.4
[KATRINA] 17.6
[THUNDER] 16.7
[MOTOR] 15.7
FICTION Coll. Score
[ENGINE] 180.1
[CROWD] 70.1
[CAR] 52.2
[MOTORCYCLE] 32.1
[KONG] 30.5
[FIRE] 25.1
[HELICOPTER] 24.5
[WIND] 22.7
[TRUCK] 21.6
[BEAST] 19.9
[LION] 18.8
[MOTOR] 18.5
[FLAME] 17.7
[TIGER] 17
[TRAIN] 14.5
[TYRANNOSAUR] 12.1
[BUS] 11.6
[THUNDER] 11.3
[AUDIENCE] 11.1
[BATMOBILE] 10.3
MAGAZINE Coll. Score
[CROWD] 65.8
[LION] 27.8
[WIND] 19.1
[ENGINE] 15
[AUDIENCE] 12.4
[TORNADO] 10.8
[FIRE] 8.2
[FLAME] 7.1
[ECONOMY] 6.9
[TRUCK] 5.2
[TRAIN] 3.9
[PLANE] 3.7
[STORM] 3.4
[FAN] 3.2
[SEA] 2
[MARKET] 1.8
[CAR] 1.8
[RIVER] 1.3
[STOCK] 1.2
[AIR] 0.8
NEWSPAPER Coll. Score
[CROWD] 67
[ENGINE] 16
[HURRICANE] 11.1
[MOTORCYCLE] 11
[JET] 9
[PLANE] 8.9
[AUDIENCE] 8.9
[HELICOPTER] 8.5
[MOUSE] 7
[KATRINA] 6.9
[LANE] 5.4
[FIRE] 5
[STORM] 4.2
[TRAIN] 3.4
[FAN] 3.2
[SOUND] 2.9
[SMITH] 2.6
[ECONOMY] 2.6
[STOCK] 2.6
[MARKET] 2.4
ACADEMIC Coll. Score
[CROWD] 14.8
[LION] 9.4
Reliance measure
• “Reliance” (Hans-Joerg Schmid)
• = “Relevance” score in COCA interface
• = “Faith(fulness)” in Coll.analysis output
• = (freq. in construction / freq. of word in corpus) x 100%
• We set min. freq = 5
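As a rough illustration of the Reliance formula, the sketch below computes the percentage and ranks nouns by it. The function names, the placeholder nouns and their counts are all hypothetical, and applying the minimum frequency of 5 to the frequency in the construction is an assumption; the slides do not say which frequency the threshold refers to.

```python
def reliance(freq_in_cxn, freq_in_corpus):
    """Share of a word's corpus tokens that occur in the construction, in %."""
    return 100.0 * freq_in_cxn / freq_in_corpus


def rank_by_reliance(counts, min_freq=5):
    """Rank nouns by Reliance, highest first.

    counts maps noun -> (freq. in construction, freq. of word in corpus).
    Nouns below min_freq in the construction are dropped (assumed reading
    of "min. freq = 5").
    """
    kept = {
        noun: reliance(in_cxn, in_corpus)
        for noun, (in_cxn, in_corpus) in counts.items()
        if in_cxn >= min_freq
    }
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


# Placeholder nouns with invented counts, for illustration only.
example_counts = {
    "NOUN_A": (50, 10_000),   # frequent in the construction, frequent overall
    "NOUN_B": (5, 200),       # rare overall, so Reliance is high
    "NOUN_C": (3, 30),        # below the minimum frequency, dropped
}
for noun, score in rank_by_reliance(example_counts):
    print(noun, round(score, 2))
```

The example shows why rare nouns such as TYRANNOSAUR or BATMOBILE can outrank CROWD on this measure: a handful of hits is a large share of a small corpus frequency.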
COCA subjects of ROAR by Reliance
SPOKEN Rel.
[LION] 0.52
[MOUSE] 0.51
[FLAME] 0.49
[KATRINA] 0.46
[HURRICANE] 0.29
[CROWD] 0.09
[STOCK] 0.05
[ECONOMY] 0.05
[WALL] 0.04
[FIRE] 0.04
[MARKET] 0.04
WRITTEN Rel.
[TYRANNOSAUR] 4.26
[BATMOBILE] 2.24
[IVOR] 1.49
[JETLINER] 1.46
[MOTORCYCLE] 0.82
[STAG] 0.81
[ENGINE] 0.62
[CHOPPER] 0.62
[CROWD] 0.51
[HARLEY] 0.48
[LION] 0.44
ALL Rel.
[TYRANNOSAUR] 4.08
[BATMOBILE] 2.18
[IVOR] 1.42
[JETLINER] 1.07
[STAG] 0.77
[MOTORCYCLE] 0.77
[ENGINE] 0.58
[CHOPPER] 0.51
[LION] 0.45
[CROWD] 0.45
[BONFIRE] 0.43
[HARLEY] 0.42
[WILDFIRE] 0.35
[BEAST] 0.35
[FLAME] 0.32
FICTION Rel.
[TYRANNOSAUR] 4.88
[BATMOBILE] 2.48
[MOTORCYCLE] 1.91
[KONG] 1.9
[ENGINE] 1.82
[IVOR] 1.73
[HARLEY] 1.04
[HELICOPTER] 0.94
[TIGER] 0.89
[HEATER] 0.88
MAGAZINE Rel.
[TORNADO] 0.69
[LION] 0.67
[CROWD] 0.61
[FLAME] 0.29
[AUDIENCE] 0.19
[ENGINE] 0.17
[WIND] 0.15
[TRUCK] 0.11
[ECONOMY] 0.08
[STORM] 0.08
NEWSPAPER Rel.
[MOTORCYCLE] 0.58
[CROWD] 0.49
[MOUSE] 0.37
[KATRINA] 0.34
[ENGINE] 0.29
[HURRICANE] 0.28
[HELICOPTER] 0.25
[JET] 0.16
[PLANE] 0.12
[LANE] 0.11
ACADEMIC Rel.
[LION] 0.31
[CROWD] 0.31
Factors and Levels
Cases       SUBJECT     OBJECT      TENSE
example1    HUMAN       HUMAN       PAST
example2    NON_HUMAN   INANIMATE   PAST
example3    HUMAN       HUMAN       PAST
example4    HUMAN       HUMAN       PRES
example5    INANIMATE   HUMAN       PAST
example6    HUMAN       INANIMATE   PAST
example7    HUMAN       INANIMATE   PAST
example8    HUMAN       INANIMATE   FUTURE
example9    HUMAN       INANIMATE   PAST
example10   NON_HUMAN   INANIMATE   PAST
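A behavioural profile is commonly summarised as the relative frequency of each level within each factor. The sketch below does this for the ten annotated cases in the table above; the function name and data layout are assumptions made for illustration, not the authors' own procedure.

```python
from collections import Counter

FACTORS = ("SUBJECT", "OBJECT", "TENSE")

# The ten annotated cases from the table above, one tuple per case.
CASES = [
    ("HUMAN", "HUMAN", "PAST"),
    ("NON_HUMAN", "INANIMATE", "PAST"),
    ("HUMAN", "HUMAN", "PAST"),
    ("HUMAN", "HUMAN", "PRES"),
    ("INANIMATE", "HUMAN", "PAST"),
    ("HUMAN", "INANIMATE", "PAST"),
    ("HUMAN", "INANIMATE", "PAST"),
    ("HUMAN", "INANIMATE", "FUTURE"),
    ("HUMAN", "INANIMATE", "PAST"),
    ("NON_HUMAN", "INANIMATE", "PAST"),
]


def behavioural_profile(cases, factors=FACTORS):
    """Return {factor: {level: relative frequency}} for the annotated cases."""
    profile = {}
    for position, factor in enumerate(factors):
        counts = Counter(case[position] for case in cases)
        total = sum(counts.values())
        profile[factor] = {level: n / total for level, n in counts.items()}
    return profile


profile = behavioural_profile(CASES)
for factor, levels in profile.items():
    print(factor, levels)
```

In a full study there would be one such profile per verb, and the resulting vectors of proportions would be the input to the comparison between verbs.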
HOWL and ROAR in 200 samples of COCA WRITTEN
            HOWL   ROAR
animal        34      7
human        117     86
inanimate     48    106
unknown        1      1
total        200    200
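One simple way to check whether the subject types are distributed differently across the two verbs is a Pearson chi-square statistic on the sampled counts above. This test is not reported on the slides; it is a sketch under the assumption that the single "unknown" case per verb can be set aside because its expected count is too small.

```python
# Subject-type counts (HOWL, ROAR) from the 200-hit samples, "unknown" omitted.
COUNTS = {
    "animal": (34, 7),
    "human": (117, 86),
    "inanimate": (48, 106),
}


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of verb and subject type.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


statistic, dof = chi_square(COUNTS)
print(f"chi-square = {statistic:.2f}, df = {dof}")
```

The statistic would then be compared against a chi-square distribution with the given degrees of freedom, for instance using a statistics package.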
All [animal] subject examples in 200 sampled ROAR hits
"AAARRRGGGHHH!" roared the angry Bunyip. 2010 FIC Faces
Sea-lions roared on the Lobos Rocks off shore 1999 MAG Smithsonian
pigeons disappeared, many people thought they'd simply roared away somewhere else 1990 MAG Wilderness
Lead Actor # Michael Caine, WWII: When Lions Roared 1994 NEWS USAToday
that great human cat who neither roars nor growls 1997 ACAD ScandinavStud
almost time for the stags to begin roaring 2011 FIC Bk:ColdVengeance
Ahead, uphill, he hears the tiger roar 2001 FIC VirginiaQRev
Some [inanimate] subject examples in 200 sampled ROAR hits
The car veered suddenly, the engine roaring.
the radio and air conditioner roaring full-blast.
Because Germany and the other north European economies are roaring ahead, flushing tax money into government coffers
pulls him off the roadway as the truck roars by.
when the train roars across above, bodies spilled and still, barely stirring.
Although frequently interrupted by the sounds of airplanes roaring overhead, Smaltz patiently tried to explain.
Huddled inside our trusty geodome with the wind roaring so loudly we could barely hear one another speak
[animal] subjects in sampled data
• 7/200 [animal] hits for ROAR
• 34/200 [animal] hits for HOWL
• [animal] is rather unlikely to play a very significant role in any multifactorial analysis of ROAR
• When compared with verbs which have a very high proportion of [animal] subjects, [animal] may be significant for ROAR in a negative way
• When compared with verbs which have no (or almost no) [animal] subjects, [animal] probably won’t reach significance as a level with ROAR
Cluster analysis of “yell” verbs based on sampled 200 hits for each verb
• inanimate *** (+)
• human *** (-)
• animal ns.
Evaluating results of three approaches
• Elicitation
  • LION-ROAR, WOLF/DOG-HOWL association is dominant
  • [animal]-ROAR/HOWL association is strong
  • Grounded in special roles for animals in learning English?
• Corpus: subject words of ROAR
  • LION-ROAR association is present but not dominant, by frequency
  • Frequency results vary considerably between sub-corpora
  • [animal]-ROAR association is strong, by Reliance measure
  • Grounded in the whim (?) of assembling texts for a corpus and the choice of association measure
• Corpus: factor analysis of ROAR, HOWL etc.
  • [animal]-ROAR/HOWL association not significant
  • LION would not be identified as associated with ROAR
  • Grounded in whim (?) of assembling texts for a corpus, focus on broad features rather than specific words, choice of other verbs to compare with
Conclusion
• Elicitation of sentences and corpus-based approaches are grounded in quite different realities; one shouldn’t expect all results to “converge”.
• Both convergent and divergent results can lead to a better understanding of different data and different methods (cf. Kepser & Reis 2005, Arppe & Järvikivi 2007).
• When working with corpora, both specific words and semantic, syntactic, etc. features are of interest.
• Behavioral Profile analysis should not deter us from also carrying out a Collostructional Analysis.