Adaptive Parser-Centric Text Normalization
Congle Zhang*, Tyler Baldwin**, Howard Ho**, Benny Kimelfeld**, Yunyao Li**
* University of Washington   ** IBM Research - Almaden
[Diagram: text analytics links public text (social media, news, SEC, USPTO), web text, and private text (internal data, subscription data) to applications such as marketing, financial investment, drug discovery, and law enforcement.]
Text analytics is the key to discovering hidden value in text.
[Slide: DREAM vs. REALITY. Image from http://samasource.org]
Can you read this on the first attempt?
    ay woundent of see ’ em
It means: I would not have seen them.
When a machine reads it (results from Google Translate):
    Chinese:    唉看见他们woundent
    Spanish:    ay woundent de verlas
    Japanese:   ローマ法王進呈の AY woundent
    Portuguese: ay woundent de vê-los
    German:     ay woundent de voir 'em
Text Normalization
• Informal writing → standard written form
    ay woundent of see ’ em  →  (normalize)  →  I would not have seen them.
Challenge: Grammar
• Text normalization ≠ mapping out-of-vocabulary non-standard tokens to their in-vocabulary standard forms
    ay woundent of see ’ em
    word-to-word: would not of see them   vs.   grammatical: I would not have seen them.
Challenge: Domain Adaptation
• Tailor the same text normalization solution to the different writing styles of different data sources
Challenge: Evaluation
• Previous metrics: word error rate & BLEU score
• However:
  – Words are not equally important
  – Non-word information (punctuation, capitalization) can be important
  – Word reordering is important
• How does normalization actually impact the downstream applications?
Adaptive Parser-Centric Text Normalization
• Produces grammatical sentences
• Domain transferable
• Evaluated by parsing performance
Outline
• Model
• Inference
• Learning
• Instantiation
• Evaluation
• Conclusion
Model: Replacement Generator
• Replacement ⟨i, j, s⟩: replace tokens x_i … x_{j-1} with string s
• Domain customization
  – Generic (cross-domain) replacements
  – Domain-specific replacements
Example: Ay₁ woudent₂ of₃ see₄ ’em₅
    ⟨2,3,"would not"⟩ (edit), ⟨1,2,"Ay"⟩ (same), ⟨1,2,"I"⟩ (edit), ⟨1,2,ε⟩ (delete), ⟨6,6,"."⟩ (insert), …
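In code, a replacement is just a typed triple. The sketch below (Python; the names and the tiny slang table are illustrative, not the paper's actual generators) shows how a couple of generic generators might emit candidates for the running example:

```python
from typing import NamedTuple, List

class Replacement(NamedTuple):
    i: int   # start token index (1-based, as on the slide)
    j: int   # end index (exclusive): replaces tokens x_i .. x_{j-1}
    s: str   # replacement string; "" plays the role of epsilon (deletion)

def generate_replacements(tokens: List[str]) -> List[Replacement]:
    """Emit candidate replacements for a token sequence (toy generic generators)."""
    slang = {"woudent": "would not", "'em": "them"}  # illustrative dictionary
    cands = []
    for idx, tok in enumerate(tokens, start=1):
        cands.append(Replacement(idx, idx + 1, tok))  # "same": leave intact
        if tok.lower() in slang:
            cands.append(Replacement(idx, idx + 1, slang[tok.lower()]))  # edit
    # "insert" generator: sentence-final punctuation at a zero-width span
    cands.append(Replacement(len(tokens) + 1, len(tokens) + 1, "."))
    return cands

cands = generate_replacements(["Ay", "woudent", "of", "see", "'em"])
```

A real system would plug in many such generators (edit distance, spell checker, domain dictionaries), all producing the same ⟨i, j, s⟩ shape.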
Model: Boolean Variables
• Associate a unique Boolean variable X_r with each replacement r
  – X_r = true: replacement r is used to produce the output sentence
Example: ⟨2,3,"would not"⟩ = true  →  output contains “… would not …”
Model: Normalization Graph
• A graphical model over the replacements
[Graph for “Ay woudent of see ’em”: nodes *START*, ⟨1,2,"Ay"⟩, ⟨1,2,"I"⟩, ⟨2,4,"would not have"⟩, ⟨2,3,"would"⟩, ⟨3,4,"of"⟩, ⟨4,5,"seen"⟩, ⟨4,6,"see him"⟩, ⟨5,6,"them"⟩, ⟨6,6,"."⟩, *END*; edges connect replacements with adjacent spans]
Model: Legal Assignment
• Soundness
  – No two true replacements overlap
  – e.g., ⟨1,2,"Ay"⟩ and ⟨1,2,"I"⟩ cannot both be true
• Completeness
  – Every input token is captured by at least one true replacement
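Both conditions are easy to check mechanically. A minimal sketch (Python, hypothetical helper) with replacements as (i, j, s) triples, 1-based, end-exclusive:

```python
def is_legal(chosen, n):
    """Check soundness and completeness of chosen replacements (i, j, s)
    over an n-token input; positions are 1-based, j exclusive."""
    # Soundness: no two chosen spans overlap
    spans = sorted((i, j) for i, j, _ in chosen)
    for (i1, j1), (i2, j2) in zip(spans, spans[1:]):
        if i2 < j1:  # next span starts before the previous one ends
            return False
    # Completeness: every input token 1..n is covered by some replacement
    covered = {t for i, j, _ in chosen for t in range(i, j)}
    return covered == set(range(1, n + 1))

legal = [(1, 2, "I"), (2, 4, "would not have"), (4, 5, "seen"),
         (5, 6, "them"), (6, 6, ".")]
```

Note that a zero-width insertion like (6, 6, ".") covers no tokens, so it never violates either condition.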
Model: Legal Assignment = Path
• A legal assignment corresponds to a path from *START* to *END*
[Graph as before; the path *START* → ⟨1,2,"I"⟩ → ⟨2,4,"would not have"⟩ → ⟨4,6,"see him"⟩ → ⟨6,6,"."⟩ → *END* yields:]
Output: I would not have see him.
Model: Assignment Probability
• Log-linear model; feature functions are defined on the edges of the graph
[Same normalization graph, now with weighted edges]
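Concretely, a log-linear model over paths would take a form like the following (a sketch consistent with the slide, not the paper's exact notation), where f(e) are the edge feature functions and θ the learned weights:

```latex
P(y \mid x) \;=\; \frac{1}{Z(x)} \exp\!\Big(\sum_{e \in y} \theta \cdot f(e)\Big),
\qquad
Z(x) \;=\; \sum_{y' \in \mathrm{paths}(x)} \exp\!\Big(\sum_{e \in y'} \theta \cdot f(e)\Big)
```

Because the score decomposes over edges, maximizing it does not require enumerating paths, which is what the Inference section exploits.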
Inference
• Select the assignment with the highest probability
• Computationally hard on general graphical models …
• But in our model it boils down to finding the longest path in a weighted directed acyclic graph
Inference: Weighted Longest Path
[Same graph; the maximum-weight *START*-to-*END* path selects ⟨1,2,"I"⟩, ⟨2,4,"would not have"⟩, ⟨4,6,"see him"⟩, ⟨6,6,"."⟩]
Output: I would not have see him.
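Because the nodes can be ordered by token position, the weighted longest path falls out of one dynamic-programming pass. A minimal sketch (Python, hypothetical names), assuming the edge list is topologically ordered:

```python
def best_path(edges, start, end):
    """Max-weight path in a DAG; edges = [(u, v, weight), ...] listed so
    that every edge into a node appears before the edges out of it."""
    score = {start: 0.0}   # best score found for each reachable node
    back = {}              # back-pointers for path recovery
    for u, v, w in edges:
        if u in score and score[u] + w > score.get(v, float("-inf")):
            score[v] = score[u] + w
            back[v] = u
    node, path = end, [end]
    while node != start:   # walk back-pointers from *END* to *START*
        node = back[node]
        path.append(node)
    return score[end], path[::-1]

# Toy graph: two routes from *START* to *END* with different total weight.
edges = [("*START*", "I", 1.0), ("*START*", "Ay", 0.2),
         ("I", "*END*", 1.0), ("Ay", "*END*", 1.0)]
score, path = best_path(edges, "*START*", "*END*")
```

In the normalization graph, each node would be a replacement and each edge weight the dot product θ · f(e) from the log-linear model.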
Learning
• Perceptron-style algorithm
  – Update weights by comparing (1) the most probable output under the current weights with (2) the gold sequence
• Input: (1) informal sentence: Ay woudent of see ‘em; (2) gold: I would not have seen them.; (3) the normalization graph
• Output: feature weights
Learning: Gold vs. Inferred
[Same graph, highlighting two paths: the gold sequence and the most probable sequence under the current θ]
Learning: Update Weights on the Differential Edges
[Same graph; increase the weights w_i on edges used by the gold path but not by the predicted path, so the gold sequence becomes “longer”]
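The update itself is the standard structured-perceptron step: add the feature counts of the gold path, subtract those of the predicted path, so gold edges score higher next time. A minimal sketch (Python; the feature names are illustrative):

```python
def perceptron_update(theta, gold_feats, pred_feats, lr=1.0):
    """One structured-perceptron step: raise weights on features of the
    gold path, lower them on features of the (wrong) predicted path."""
    for f, v in gold_feats.items():
        theta[f] = theta.get(f, 0.0) + lr * v
    for f, v in pred_feats.items():
        theta[f] = theta.get(f, 0.0) - lr * v
    return theta

# If prediction and gold agree on an edge, its +1 and -1 cancel; only the
# differential edges actually move.
theta = perceptron_update({}, {"ngram:would_not": 1.0}, {"ngram:woudent": 1.0})
```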
Instantiation: Replacement Generators

  Generator               From       To
  leave intact            good       good
  edit distance           bac        back
  lowercase               NEED       need
  capitalize              it         It
  Google spell            dispaear   disappear
  contraction             wouldn’t   would not
  slang language          ima        I am going to
  insert punctuation      ε          .
  duplicated punctuation  !?         !
  delete filler           lmao       ε
Instantiation: Features
• N-gram
  – Frequency of the phrases induced by an edge
• Part-of-speech
  – Encourage certain behavior, such as avoiding the deletion of noun phrases
• Positional
  – Capitalize words after stop punctuation
• Lineage
  – Which generator spawned the replacement
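A few of these feature families can be sketched as a function of an edge between two adjacent replacement strings (Python; the names and the tiny count table are illustrative, not the paper's feature set):

```python
def edge_features(prev_s, s, generator, counts):
    """Toy feature vector for the edge from replacement string prev_s to s."""
    feats = {}
    bigram = f"{prev_s} {s}"
    feats["ngram:" + bigram] = counts.get(bigram, 0)   # n-gram frequency feature
    feats["lineage:" + generator] = 1.0                # which generator spawned s
    if prev_s.endswith((".", "!", "?")) and s[:1].isupper():
        feats["positional:cap_after_stop"] = 1.0       # positional feature
    return feats
```

During inference each edge weight is the dot product of such a vector with the learned θ.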
Evaluation Metrics: Compare Parses
[Diagram: the input sentence is normalized by a human expert into a gold sentence and by the normalizer into a normalized sentence; both are run through the parser, and the gold parse is compared with the normalized parse.]
• Focus on subjects, verbs, and objects (SVO)
Evaluation Metrics: Example
  Test: I kinda wanna get ipad NEW
  Gold: I kind of want to get a new iPad.

  Verbs (test): verb(get)
  Verbs (gold): verb(want), verb(get)
  precision_v = 1/1,  recall_v = 1/2

  Subj-Obj (test): subj(get,I), subj(get,wanna), obj(get,NEW)
  Subj-Obj (gold): subj(want,I), subj(get,I), obj(get,iPad)
  precision_so = 1/3,  recall_so = 1/3
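The metric reduces to set precision and recall over parse tuples; this short sketch (Python) reproduces the numbers on the slide:

```python
def prec_rec(test, gold):
    """Precision and recall of test items against gold items, as sets."""
    test, gold = set(test), set(gold)
    hit = len(test & gold)
    return hit / len(test), hit / len(gold)

test_verbs = ["verb(get)"]
gold_verbs = ["verb(want)", "verb(get)"]
test_so = ["subj(get,I)", "subj(get,wanna)", "obj(get,NEW)"]
gold_so = ["subj(want,I)", "subj(get,I)", "obj(get,iPad)"]

pv, rv = prec_rec(test_verbs, gold_verbs)  # 1/1 and 1/2
pso, rso = prec_rec(test_so, gold_so)      # 1/3 and 1/3: only subj(get,I) matches
```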
Evaluation: Baselines
• w/oN: without normalization
• Google: Google spell checker
• w2wN: word-to-word normalization [Han and Baldwin 2011]
• Gw2wN: gold-standard word-to-word normalizations from previous work (whenever available)
Evaluation: Domains
• Twitter [Han and Baldwin 2011]
  – Gold: grammatical sentences
• SMS [Choudhury et al. 2007]
  – Gold: grammatical sentences
• Call-Center Log (proprietary)
  – Text-based responses about users’ experience with a call center for a major company
  – Gold: grammatical sentences
Evaluation: Twitter
• Twitter-specific replacement generators
  – Hashtags (#), ats (@), and retweets (RT)
  – Generators that allow either the initial symbol or the entire token to be deleted
Evaluation: Twitter

  System           Verb: Pre  Rec  F1    Subj-Obj: Pre  Rec  F1
  w/oN                   83.7 68.1 75.1            31.7 38.6 34.8
  Google                 88.9 78.8 83.5            36.1 46.3 40.6
  w2wN                   87.5 81.5 84.4            44.5 58.9 50.7
  Gw2wN                  89.8 83.8 86.7            46.9 61.0 53.0
  generic                91.7 88.9 90.3            53.6 70.2 60.8
  domain-specific        95.3 88.7 91.9            72.5 76.3 74.4

• Domain-specific generators yielded the best overall performance.
• Even without domain-specific generators, our system outperformed the word-to-word normalization approaches.
• Even perfect word-to-word normalization (Gw2wN) is not good enough!
Evaluation: SMS
• SMS-specific replacement generator: a mapping dictionary of SMS abbreviations
Evaluation: SMS

  System           Verb: Pre  Rec  F1    Subj-Obj: Pre  Rec  F1
  w/oN                   76.4 48.1 59.0            19.5 21.5 20.4
  Google                 85.1 61.6 71.5            22.4 26.2 24.1
  w2wN                   78.5 61.5 68.9            29.9 36.0 32.6
  Gw2wN                  87.6 76.6 81.8            38.0 50.6 43.4
  generic                86.5 77.4 81.7            35.5 47.7 40.7
  domain-specific        88.1 75.0 81.0            41.0 49.5 44.8
Evaluation: Call-Center
• Call center-specific generator: a mapping dictionary of call-center abbreviations (e.g., “rep.” → “representative”)
Evaluation: Call-Center

  System           Verb: Pre  Rec  F1    Subj-Obj: Pre  Rec  F1
  w/oN                   98.5 97.1 97.8            69.2 66.1 67.6
  Google                 99.2 97.9 98.5            70.5 67.3 68.8
  generic                98.9 97.4 98.1            71.3 67.9 69.6
  domain-specific        99.2 97.4 98.3            87.9 83.1 85.4
Discussion
• Domain transfer with a small amount of effort is possible
• Performing normalization is indeed beneficial to dependency parsing
  – Simple word-to-word normalization is not enough
Conclusion
• A normalization framework with an eye toward domain adaptation
• A parser-centric view of normalization
• Our system outperformed competitive baselines over three different domains
• Dataset to spur future research: https://www.cs.washington.edu/node/9091/
Team