Focus Contrast in Web Harvested Data
Mats Rooth
Linguistics and CIS
Cornell University
based on joint research with Jonathan Howell
Radio sites
• Hundreds use Everyzing/Ramp technology
• Full ASR transcripts often available
• Time offset sometimes available
• Either URL of audio or RSS feed almost always available
• Not enough hits for one target on a single site
• A lot of repetitions of the same audio
• Seemingly less “spontaneous” speech than on Everyzing
YouTube
• Searchable closed captions, some obtained with ASR and some provided by the video author
• Time offset available on hit page and in URL
• YouTube player can seek to a time
• Transcript of snippet available
• Full transcript not available
• Not enough data now
• Can hope that a lot of indexed spontaneous speech will become available
Reuters Insider
• Searchable audio based on Everyzing/Ramp
• Full transcripts available
• Player seeks to timestamp
Goals
Assemble large, focused datasets of examples where intonation varies in a way that correlates with syntax, semantics, or pragmatics.
Study correlation between lexical/grammatical/pragmatic context and acoustic realization.
he stayed longer than I did
-er [[ he stayed x long]2
than [ IF stayed x long ]~2]
[ y stayed x-long ] antecedent clause
[ speaker stayed x-long ] scope of focus
… I should have liked that song a lot more than I did.
[more
x[[should w[ I like that song x well in w]]
than [I like that song x well in w0]]]
I understand even less than I did before
even less [[ I prs understand x much]2
than [I understood x much beforeF] ]~2]
Alternative semantics for focus
-er [[ he stayed x long]2
than [ IF stayed x long ]~2]
[ y stayed x-long ] antecedent clause
[ speaker stayed x-long ] scope of focus
Semantics of focus is the set of alternative propositions of the form ‘y stayed x long’.
Licensing condition for focus: The proposition contributed by the antecedent is an element of the alternative set that is distinct from the proposition contributed by the scope.
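The licensing condition can be illustrated with a toy model. This is only a sketch under stated assumptions: a proposition is modeled as a (subject, predicate) pair, the domain of individuals is a hypothetical three-element set, and the names `alternatives` and `licensed` are invented for illustration.

```python
# Toy model of the alternative-semantics licensing condition.
# Assumption: a proposition is a (subject, predicate) pair; DOMAIN is hypothetical.
DOMAIN = {"he", "speaker", "mary"}

def alternatives(predicate, domain=DOMAIN):
    """Focus semantic value: all propositions 'y stayed x long' for y in the domain."""
    return {(y, predicate) for y in domain}

def licensed(antecedent, scope):
    """The antecedent must be in the scope's alternative set and differ from the scope."""
    _, predicate = scope
    return antecedent in alternatives(predicate) and antecedent != scope

scope = ("speaker", "stayed x long")    # scope of focus: focused subject "I"
antecedent = ("he", "stayed x long")    # proposition contributed by the main clause

print(licensed(antecedent, scope))  # True
print(licensed(scope, scope))       # False: not distinct from the scope
```

In this model, focus on the subject is licensed because the main clause differs from the than-clause only in the subject position.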
Givenness/Entailment semantics for focus
[ y stayed x-long ] antecedent clause
[ speaker stayed x-long ] scope of focus
Licensing condition for focus The antecedent entails the union of the alternative set (focus existential closure).
If he stayed d long, then someone stayed d long.
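The entailment check can be sketched with propositions modeled as sets of possible worlds. The two-individual domain and the function names are assumptions made for illustration. A world is represented simply as the set of individuals who stayed d long in it.

```python
from itertools import combinations

INDIVIDUALS = ("he", "speaker")  # hypothetical domain

# A world is the set of individuals who stayed d long in that world.
WORLDS = [frozenset(c) for r in range(len(INDIVIDUALS) + 1)
          for c in combinations(INDIVIDUALS, r)]

def stayed(y):
    """Proposition 'y stayed d long' as the set of worlds where it holds."""
    return {w for w in WORLDS if y in w}

def existential_closure():
    """Union of the alternatives 'y stayed d long': someone stayed d long."""
    return set().union(*(stayed(y) for y in INDIVIDUALS))

def entails(p, q):
    """p entails q if every p-world is a q-world."""
    return p <= q

print(entails(stayed("he"), existential_closure()))  # True
```

The antecedent 'he stayed d long' entails 'someone stayed d long', so focus is licensed; the converse entailment does not hold.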
Alternative semantics and givenness semantics are predictive theories of focus licensing, if the antecedent is stipulated.
Almost always, the antecedent for focus in the than-clause is the main clause.
With that hedge, grammar makes a prediction about where focus should go.
Try to correlate this with acoustic signal.
Focus in comparative clauses
Coherent semantic theory about where focus should go
Possibilities are constrained, because the main clause is usually the antecedent for focus interpretation in the comparative clause
On a theoretical basis, we often think we know the correct grammatical analysis of comparative sentences people use, including the features that determine focus
Nice model system for studying contextual conditioning and phonetic realization of contrastive intonation
Automatic harvest procedure
Replicates how a user would interact with website.
curl: retrieve information designated by URL
cutmp3: cut audio file given offsets
awk: process html
awk, bash
make: control
Time for one run retrieving 1000 hits is less than a day.
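One step in such a pipeline is turning the time offset of a hit into start and end offsets for the audio cutter. The sketch below assumes a fixed amount of padding before and after the hit. The padding values and function names are invented for illustration and are not the settings used in the harvest, and the m:ss format is not claimed to be the syntax any particular tool expects.

```python
def cut_window(hit_offset_s, before_s=5.0, after_s=10.0, duration_s=None):
    """Return (start, end) in seconds around a hit, clamped to the file bounds."""
    start = max(0.0, hit_offset_s - before_s)
    end = hit_offset_s + after_s
    if duration_s is not None:
        end = min(end, duration_s)
    return start, end

def as_min_sec(seconds):
    """Format whole seconds as m:ss."""
    whole = int(seconds)
    return f"{whole // 60}:{whole % 60:02d}"

start, end = cut_window(127.0)
print(as_min_sec(start), as_min_sec(end))  # 2:02 2:17
```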
116 a1135.g.akamai.net
110 hosted-media.podzinger.com
76 media.weei.podzinger.com
58 feeds.wnyc.org
54 media.libsyn.com
51 podcastdownload.npr.org
50 feeds.feedburner.com
39 library.kraftsportsgroup.com
33 www.whiterosesociety.org
24 www.kpbs.org
21 www.podtrac.com
21 media.wrko.podzinger.com
Jonathan Howell
[Figure: number of tokens for the targets “than he did”, “he himself”, “his own”, “for one thing”, and “the one thing” in Switchboard, Everyzing (collected/verified), and Everyzing (projected)]
Classification experiment
He stayed longer than IF did. (s class)
antecedent: He stayed x long
I should have liked that song a lot more than I didF. (ns class)
antecedent: I should have liked that song x much
I understand even less than I did beforeF. (ns class)
antecedent: I understand even x little
SVM classifier in the R statistical environment (e1071 package)
308 acoustic parameters extracted with Praat
91 tokens in cross-validated design
(Several hundred more tokens with similar results.)
1. all parameters
3. duration of “I” only
4. duration of “I”, duration of “d” closure, formant difference 40% into “I”
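The cross-validated design can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier stands in for the SVM, so this is not the e1071 setup used in the experiment. The feature values are invented toy numbers, not measurements, and the function names are hypothetical.

```python
def centroid(rows):
    """Mean feature vector of a list of feature tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(train, x):
    """Assign x to the class whose mean feature vector is closest."""
    by_class = {}
    for feats, label in train:
        by_class.setdefault(label, []).append(feats)
    return min(by_class, key=lambda c: sq_dist(centroid(by_class[c]), x))

def leave_one_out_accuracy(data):
    """Hold out each token in turn, train on the rest, and score the prediction."""
    hits = sum(predict(data[:i] + data[i + 1:], feats) == label
               for i, (feats, label) in enumerate(data))
    return hits / len(data)

# Invented toy tokens: (duration of "I" in seconds,), class label. Not real data.
tokens = [((0.21,), "s"), ((0.19,), "s"), ((0.23,), "s"),
          ((0.09,), "ns"), ((0.11,), "ns"), ((0.08,), "ns")]

print(leave_one_out_accuracy(tokens))  # 1.0
```

With a single feature, this corresponds to the "duration of “I” only" condition; adding more entries to each feature tuple would correspond to the larger parameter sets.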
Method suggested by comparatives experiment
Find common grammatical or lexical contexts that trigger representations with different prosodic realization, according to relatively well-understood and well-supported theory.
Correlate the semantic-grammatical categories directly with the speech signal using machine learning.
Don’t worry about phonemic/morphemic categories like the accent types H* and L+H*, or assume that they can be annotated on the basis of the pitch contour.
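Féry and Ishihara (2009) Journal of Linguistics 45.3
SOF: Prenuclear
Die meisten unserer Kollegen waren beim Betriebsausflug lässig angezogen. Nur Peter hat eine Krawatte getragen.
(Most of our colleagues were dressed casually at the company outing. Only Peter wore a tie.)
Nur Peter hat sogar einen Anzug getragen.
(Only Peter even wore a suit.)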
He’s gotta pick someone who is younger than he is, and is definitely more conservative than he is.
[-er [ t is d young than he is d young]]2 and
more [[ t is d conservativeF]3
than [ heF is d conservative ] ~3 ] ~2
+Generic corpus of focused pronouns
The SVM classifier is good at detecting focused pronouns using local features of the pronoun:
Duration of vowel “I” [ai]
Distance between f1 and f2 halfway into vowel “I” [ai]
Inherently contrastive phrases
in MY opinion ... admits that other things might be true in other people’s opinions
NEXT Friday ... at the end of a weekly Friday radio program
on the TENOR saxophone ... in a jazz program where there is also frequent mention of the alto saxophone
1162 of> my life
1110 in> my life
681 in> my mind
377 in> my opinion
276 in> my view
231 in> my heart
217 of> my career
199 in> my career
183 in> my head
146 with> my life
146 with> my family
141 on> my way
140 of> my mind
139 on> my part
134 in> my lifetime
125 in> my office
115 of> my family
108 with> my wife
106 on> my face
106 in> my house
99 on> my mind
96 over> my head
96 in> my family
91 for> my family
90 in> my face
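Counts of this kind can be obtained from transcripts by tallying the words immediately before and after “my”. The sketch below assumes plain-text transcripts; the sample sentence and the function name are invented for illustration, and it counts all contexts rather than only prepositional ones.

```python
import re
from collections import Counter

def my_contexts(text):
    """Count (previous word, next word) pairs around each occurrence of 'my'."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for prev, mid, nxt in zip(words, words[1:], words[2:]):
        if mid == "my":
            counts[(prev, nxt)] += 1
    return counts

# Invented sample text, not taken from the harvested data.
sample = "In my opinion it was the best year of my life, and in my opinion it still is."

for (prev, nxt), n in my_contexts(sample).most_common():
    print(n, prev, "my", nxt)
```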
+ Does general SVM pronoun focus classifier work on SOF tokens?
+ How common is SOF?
[you made a very small amount more than I did]2
[nowF I make muchF more than youF do] ~2
2 is of the form
required form of antecedent: at t speaker makes d-much more than hearer makes
actual: at t hearer makes d-much more than speaker makes
two SOF tokens
You made a very small amount more than I did. Now I make muchF more than youF do.
Is there a correlation between the string context and the prosody type?
+ Learn information-theoretically
-- two distributions of acoustic pronoun realizations
-- two distributions of trigram contexts that condition them
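\[
P(\text{in } a \text{ opinion}) \overset{\mathrm{def}}{=}
P(\text{type 1})\, P(\langle \text{in}, \text{opinion} \rangle \mid \text{type 1})\, P(a \mid \text{type 1})
+ P(\text{type 2})\, P(\langle \text{in}, \text{opinion} \rangle \mid \text{type 2})\, P(a \mid \text{type 2})
\]
where a is the acoustic realization of the pronoun between “in” and “opinion”.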
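The two-type mixture can be evaluated as in the sketch below, where each prosody type has its own distribution over trigram contexts and its own distribution over acoustic realizations of the pronoun. All parameter values are invented, the acoustic realization is reduced to two hypothetical categories ("long" and "short") for simplicity, and the sketch only evaluates the model; it does not learn the parameters.

```python
def joint(context, acoustic, prior, p_context, p_acoustic):
    """P(context, acoustic) as a mixture over the prosody types."""
    return sum(prior[t] * p_context[t][context] * p_acoustic[t][acoustic]
               for t in prior)

def posterior(context, acoustic, prior, p_context, p_acoustic):
    """P(type | context, acoustic) by Bayes' rule."""
    total = joint(context, acoustic, prior, p_context, p_acoustic)
    return {t: prior[t] * p_context[t][context] * p_acoustic[t][acoustic] / total
            for t in prior}

# Invented parameters, for illustration only.
prior = {"type 1": 0.5, "type 2": 0.5}
p_context = {"type 1": {("in", "opinion"): 0.8, ("of", "life"): 0.2},
             "type 2": {("in", "opinion"): 0.2, ("of", "life"): 0.8}}
p_acoustic = {"type 1": {"long": 0.75, "short": 0.25},
              "type 2": {"long": 0.25, "short": 0.75}}

print(joint(("in", "opinion"), "long", prior, p_context, p_acoustic))
print(posterior(("in", "opinion"), "long", prior, p_context, p_acoustic))
```

With these toy numbers, a long pronoun in the context 〈in, opinion〉 is assigned mostly to type 1, since both the context and the acoustic realization favor that type.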
What don’t we know about Focus realization? Accent type
Claim that English focal accents divide into
• Topic (T), contrastive theme, L+H*
• Focus (F), H*
What about Anna? Who did she come with?
AnnaT came with MannyF.
What about Manny? Who came with him?
AnnaF came with MannyT.
Attempt to make do pragmatically without a T/F distinction in alternative semantics
Michael Wagner (2008). A Compositional Theory of Contrastive Topics. NELS 38.
It is controversial whether there is a categorical phonetic distinction among H*, L*+H, and L+H*.
He’s gotta pick someone who is younger than he is, and is definitely more conservative than he is.
[-er [[t is d youngF]5
than [heF is d young] ~5 ]]2 ~4 and
more [[ t is d conservativeF]3
than [ heF is d conservative ] ~3 ]4 ~2
A. Nenkova, J. Brenier, A. Kothari, S. Calhoun, L. Whitton, D. Beaver, D. Jurafsky. To memorize or predict: prominence labeling in conversational speech.
Sasha Calhoun. Information Structure and the Prosodic Structure of English: a Probabilistic Relationship. PhD thesis, University of Edinburgh, 2006.
Markup and prediction of accented words in Switchboard corpus
Try to do this for pronouns only
What don’t we know about Focus realization? Non-anaphoric focus.
Féry and Samek-Lodovici (2006) Language 82.1
[(An AMERICANf farmer) (with a purple CHEVROLET) (was talking to a CANADIANf farmer) (with a purple Chevrolet)]f
Distribution of datasets
Audio snippets can probably be distributed under fair use.
http://confluence.cornell.edu/display/prosody/Prosody+Datasets
• A lot of naturalistic data bearing on theories of prosody can be found using search engines that index audio using ASR.
• Machine learning classification is a good methodology for prosody, because one can work with semantic-pragmatic categories that figure in formal theories.
• For focus, try to build classifiers, not just find statistically significant correlations with acoustic parameters. Classifiers such as SVMs can combine information from a lot of features.