An Investigation of Implicatures in Chinese
Lingjia Deng, Janyce Wiebe
Intelligent Systems Program, Department of Computer Science, University of Pittsburgh
Outline
  Introduction
  Implicature in Chinese
  Inference in Chinese
  Extracting Chinese GoodFor/BadFor
    Chinese GoodFor/BadFor Words
    Syntax of Chinese Agents/Objects
    Chinese Sentiment Analysis
  Conclusions
Introduction
  Scenario:
    The government proposes the Affordable Care Act bill. We want to analyze everyone's opinion of it.
    We can collect opinions through surveys, questionnaires, etc.
    We can also collect writers' stances by analyzing their online posts.
Introduction
  “The bill will lower the skyrocketing healthcare costs.”
  Explicit (direct) sentiment:
    writer negative toward the skyrocketing healthcare costs
      “The healthcare cost is too high. I cannot afford it.”
  What about the bill?
  Implicit (inferred) sentiment:
    writer positive toward the event that the bill will lower costs
      “There is a chance that the costs could be decreased! I love it!”
    writer positive toward the bill
      “The bill is able to do this! I'll vote for it!”
GoodFor/BadFor Event
  “The bill will lower the skyrocketing healthcare costs.”
    <bill, lower, healthcare costs>
  GoodFor/BadFor Event (Deng et al., ACL 2013 short):
    goodFor event: help, increase, etc.
    badFor event: lower, destroy, decrease, etc.
    <agent, goodFor/badFor event, object>
  GoodFor/BadFor Corpus (Deng et al., ACL 2013 short):
    134 political editorials
    e.g. <bill, lower, healthcare costs>
    e.g. <positive, badFor, negative>
    almost 20% of sentences have clear goodFor/badFor events
    available at mpqa.cs.pitt.edu
  Also called: Benefactive/Malefactive Event
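The triple notation above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the class name `Triple`, the helper `infer_agent_sentiment`, and the simple sign-flip rule are assumptions, chosen only to agree with the slide's example <positive, badFor, negative>.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """An <agent, goodFor/badFor event, object> triple (hypothetical container)."""
    agent: str
    event: str
    polarity: str  # "goodFor" or "badFor"
    obj: str


def flip(sentiment: str) -> str:
    """Swap positive and negative; leave any other label (e.g. neutral) unchanged."""
    return {"positive": "negative", "negative": "positive"}.get(sentiment, sentiment)


def infer_agent_sentiment(triple: Triple, object_sentiment: str) -> str:
    """Assumed rule: goodFor keeps the object's sentiment, badFor flips it."""
    if triple.polarity == "goodFor":
        return object_sentiment
    if triple.polarity == "badFor":
        return flip(object_sentiment)
    raise ValueError(f"unknown polarity: {triple.polarity}")


# The slide's example: the writer is negative toward the healthcare costs,
# and "lower" is a badFor event, so the inferred sentiment toward the bill flips.
example = Triple(agent="bill", event="lower", polarity="badFor", obj="healthcare costs")
agent_sentiment = infer_agent_sentiment(example, "negative")
print(example.agent, "->", agent_sentiment)
```

The sketch covers only one direction of inference (object to agent); the published rule schemas are richer than this.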
Related Work
  Words/phrases directly imply implicit opinions. (Zhang and Liu, 2011; Feng et al., 2013)
  Infer an overall polarity of a sentence by compositional semantics. (Choi and Cardie, 2008; Moilanen et al., 2010)
  Identify classes of goodFor/badFor terms, and carry out studies involving artificially constructed goodFor/badFor triples and corpus examples matching fixed linguistic templates. (Anand and Reschke 2010; 2011)
  Generate a lexicon of patient polarity verbs, which correspond to goodFor/badFor events whose spans are verbs. (Goyal et al., 2012)
  Investigate sarcasm where the writer holds a positive sentiment toward a negative situation. (Riloff et al., 2013)
Our Work on GoodFor/BadFor
  An annotated goodFor/badFor corpus. (Deng et al., ACL 2013 short)
  A sense-level goodFor/badFor lexicon. (Choi et al., WASSA 2014)
  Four inference rule schemas and a graph-based model for sentiment propagation. (Deng and Wiebe, EACL 2014)
  An optimization framework for joint sentiment inference and disambiguating goodFor/badFor components. (Deng et al., Coling 2014)
  A rule-based framework for representing and analyzing opinion implicatures. (Wiebe and Deng, arXiv 2014; WASSA 2014)
Motivation For This Work
  This work is an investigation of implicatures in Chinese.
  People speaking different languages may express their opinions in different ways.
  Before directly applying the English goodFor/badFor implicature to Chinese, we want to investigate:
    whether such implicature also exists in Chinese;
    whether the sentiment inference rules also apply to Chinese implicit opinions;
    whether it is feasible to extract goodFor/badFor events and the corresponding components in Chinese.
Outline
  Motivation
  Implicature in Chinese
    Agreement Study
  Inference in Chinese
  Extracting Chinese GoodFor/BadFor
    Chinese GoodFor/BadFor Words
    Syntax of Chinese Agents/Objects
    Chinese Sentiment Analysis
  Conclusions
Implicature in Chinese: Agreement Study
  An opinion-oriented, paragraph-parallel corpus: the Chinese version of the New York Times (http://cn.nytimes.com/).
  Select the English paragraphs containing English goodFor/badFor words.
  Present the parallel Chinese paragraphs.
Implicature in Chinese: Agreement Study
  All three annotators, including me, are Chinese graduate students at the University of Pittsburgh.
  Annotate 60 paragraphs, 253 sentences.
  Conduct the agreement study in the same manner as (Deng et al., 2013).
Implicature in Chinese: Agreement Study
  Train with the English manual (Deng et al., 2013) and several annotated Chinese examples.
  Annotate:
    (A) spans of the goodFor/badFor events
    (B) spans of the agents and objects of the events
    (C) polarities of the events: goodFor or badFor
    (D) writer's sentiments toward the agents and objects: positive, negative, neutral
  Evaluate with the same metrics as (Deng et al., 2013):
    for (A) & (B): percentage of spans that both annotators mark
    for (C) & (D): kappa
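The two kinds of agreement scores named above can be sketched as follows. This is an illustration under assumptions, not the study's scoring script: the function names are hypothetical, spans are taken to be (start, end) character offsets, and two spans are counted as matching whenever they overlap at all, which may differ from the exact matching criterion of (Deng et al., 2013).

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def spans_overlap(s, t):
    """True if half-open spans (start, end) share at least one position."""
    return s[0] < t[1] and t[0] < s[1]


def span_agreement(spans_a, spans_b):
    """Fraction of annotator A's spans that overlap some span of annotator B."""
    if not spans_a:
        return 0.0
    matched = sum(any(spans_overlap(s, t) for t in spans_b) for s in spans_a)
    return matched / len(spans_a)


# Toy data, invented for illustration only.
kappa = cohen_kappa(["goodFor", "goodFor", "badFor", "badFor"],
                    ["goodFor", "goodFor", "badFor", "goodFor"])
overlap = span_agreement([(0, 3), (5, 8)], [(2, 4)])
print(kappa, overlap)
```

Note that `span_agreement` is directional: swapping the two annotators can give a different value.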
Implicature in Chinese: Agreement Study
  overlap(a,b)   (A) goodFor/badFor span       (B) agent span                (B) object span
  Anno 1&2       0.7929                        0.9091                        0.9091
  Anno 1&3       0.7044                        0.9524                        1.0
  kappa          (C) goodFor/badFor polarity   (D) sentiment toward agent    (D) sentiment toward object
  Anno 1&2       0.9385                        0.7830                        0.7238
  Anno 1&3       0.8966                        0.5913                        0.8478
  All the scores are good: trained with the English manual, the annotators are able to detect similar implicature in Chinese.
  Scores of (A) and (D) are lower than those in the English goodFor/badFor agreement study (Deng et al., 2014).
Implicature in Chinese: Agreement Study
  For annotating (D) writer's sentiments, the main disagreement comes from cases where:
    Anno 1 annotated positive or negative
    Anno 2 annotated neutral
  We conduct a phase-II agreement study on 10 editorials from the English corpus (Deng et al., 2013). Three scores:
    I. agreement scores in Chinese by the three annotators
    II. agreement scores in English by the three annotators
    III. previous agreement scores (Deng et al., 2013)
  score I = score II; score I < score III; score II < score III
  The annotators have a similar understanding of implicatures in the two languages.
Implicature in Chinese: Agreement Study
  For annotating (A) goodFor/badFor events, the major disagreement comes from cases where:
    Anno 1 marks a goodFor/badFor span
    Anno 2 doesn't mark it because he thinks it violates the syntax rules we specified in the English manual.
  Syntax rules are specified in the English manual to guide the annotators to focus on clear cases of goodFor/badFor events, e.g.
    The object should be the major semantic object.
    The goodFor/badFor polarity should be perceived within the triple.
GoodFor/BadFor Cases Evoked by Chinese Syntax
  The goodFor/badFor polarity should be perceived within the triple.
  It will put the reform to die.
  In English: this is NOT annotated as a goodFor/badFor event.
    “put” is the verb: <it, put, reform>
    put X to die: badFor X
    put X to revive: goodFor X
GoodFor/BadFor Cases Evoked by Chinese Syntax
  The goodFor/badFor polarity should be perceived within the triple.
  It will put the reform to die.
  这将把改革置于死地。
  In Chinese: this can be represented as a clear goodFor/badFor case.
    “put” is not a verb in the Chinese sentence
    BA structure (Chao, 1968; Li and Thompson, 1989; Sybesma, 1992)
    subject, BA, object, verb
    it will BA kill the reform
Implicature in Chinese: Conclusion
  Such syntax is common in Chinese.
  GoodFor/badFor events that arise from this syntax are clear enough in Chinese, e.g. “It will kill the reform.”
  To study Chinese goodFor/badFor fully, the manual should be revised to provide guidance on annotating such events.
  Overall, similar implicatures can be perceived in English and in Chinese.
Outline
  Motivation
  Implicature in Chinese
  Inference in Chinese
    Graph Model for Sentiment Propagation (Deng and Wiebe, 2014)
  Extracting Chinese GoodFor/BadFor
    Chinese GoodFor/BadFor Words
    Syntax of Chinese Agents/Objects
    Chinese Sentiment Analysis
  Conclusions
Graph Model (Deng and Wiebe, 2014)
  [Figure: agent/object nodes connected by goodFor and badFor edges; components labeled Explicit Sentiment Detector, Loopy Belief Propagation, and Encoding Inference Rules.]
Inference in Chinese: Graph Model Performance
  We run an isolated evaluation of the graph model itself (Deng and Wiebe, 2014).
  For each node, count how often its sentiment is propagated correctly when a neighbor node is assigned the correct sentiment label.
  The scores in Chinese are lower than those in English (89% in (Deng and Wiebe, 2014)).
    Blocked inference
  Dataset                # subgraphs   correctness
  all subgraphs          136           0.7058
  multi-node subgraphs   61            0.8251
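The isolated evaluation described above can be approximated with a short sketch. It is a simplification under assumptions, not the evaluation from (Deng and Wiebe, 2014): it propagates across a single edge with a sign-flip rule instead of running loopy belief propagation, and the names `propagate` and `propagation_correctness` as well as the toy data are hypothetical.

```python
def propagate(neighbor_label, edge_polarity):
    """Assumed edge rule: goodFor copies the neighbor's sentiment, badFor flips it."""
    if edge_polarity == "goodFor":
        return neighbor_label
    return "negative" if neighbor_label == "positive" else "positive"


def propagation_correctness(gold, edges):
    """Fraction of (neighbor -> node) propagations that reproduce the gold label.

    gold:  dict mapping node name to its gold sentiment ("positive"/"negative")
    edges: list of (agent, polarity, object) triples
    """
    correct = total = 0
    for agent, polarity, obj in edges:
        # Try both directions: give one endpoint its gold label, predict the other.
        for source, target in ((agent, obj), (obj, agent)):
            total += 1
            if propagate(gold[source], polarity) == gold[target]:
                correct += 1
    return correct / total if total else 0.0


# Toy data, invented for illustration. In the second triple the gold labels do not
# follow the edge rule, as happens when an inference is blocked.
gold = {"bill": "positive", "costs": "negative", "law": "positive", "labor": "positive"}
edges = [("bill", "badFor", "costs"), ("law", "badFor", "labor")]
score = propagation_correctness(gold, edges)
print(score)
```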
Blocked Inference: In Chinese and English
  …a misreading which estimated the law would “reduce the amount of labor …
    <law, reduce, labor> (badFor)
  The writer does not believe <law, reduce, labor>; the “misreading” does.
  The writer is negative toward the “misreading”.
  For events that the writer does not believe to be true, the inference should be blocked.
    The event is not in the writer's belief space (Wiebe and Deng, 2014).
Inference in Chinese: Conclusion
  There are cases where the inference rules are blocked, but:
    These cases appear both in Chinese and in English.
    We did not find evidence that blocked inference occurs only in English.
  Apart from the blocked inferences, the good correctness scores provide evidence that the inference rules also apply to Chinese.
Outline
  Motivation
  Implicature in Chinese
  Inference in Chinese
  Extracting Chinese GoodFor/BadFor
    Chinese GoodFor/BadFor Words
    Syntax of Chinese Agents/Objects
    Chinese Sentiment Analysis
    English + Parallel Corpus?
  Conclusions
Chinese GoodFor/BadFor Words
  Given that we have an English goodFor/badFor lexicon (Choi et al., 2014), is it feasible to derive a bilingual goodFor/badFor lexicon from a parallel corpus?
  We manually find the parallel English spans corresponding to the annotated goodFor/badFor spans in Chinese.
  76.25% of the annotated Chinese goodFor/badFor spans have parallel goodFor/badFor spans in English.
  For the other annotated Chinese goodFor/badFor spans, there is no corresponding goodFor/badFor span in English, due to:
    Chinese syntax;
    paraphrasing.
Chinese Agent/Object
  We use the Stanford dependency parser to extract the agent/object in English (Deng et al., 2014).
    nsubj-(event, agent)
    dobj-(event, object)
  Can we use the same dependency labels to extract the agent/object in Chinese?
  We choose the Chinese Stanford dependency parser.
    Some dependency labels exist in both Chinese and English.
    There are more nsubj and dobj relations in the Chinese data than in the English data.
    Some labels are designed specifically for Chinese (Chang et al., 2009).
      19.57% of agents, 25.82% of objects.
      They are similar to some labels in English.
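The label-based extraction above can be illustrated with a small sketch. It assumes the parser output is already available as (label, head, dependent) tuples; the dependencies below are written by hand for the running English example rather than produced by calling the Stanford parser, and the function name `extract_agent_object` is hypothetical.

```python
def extract_agent_object(dependencies, event):
    """Return (agent, object) for an event word using nsubj and dobj relations.

    dependencies: iterable of (label, head, dependent) tuples
    event:        the goodFor/badFor event word acting as the head
    """
    agent = obj = None
    for label, head, dependent in dependencies:
        if head != event:
            continue  # only relations headed by the event are relevant
        if label == "nsubj" and agent is None:
            agent = dependent
        elif label == "dobj" and obj is None:
            obj = dependent
    return agent, obj


# Hand-written dependencies for "The bill will lower the skyrocketing healthcare costs."
deps = [
    ("det", "bill", "The"),
    ("nsubj", "lower", "bill"),
    ("aux", "lower", "will"),
    ("dobj", "lower", "costs"),
    ("amod", "costs", "skyrocketing"),
    ("nn", "costs", "healthcare"),
]
agent, obj = extract_agent_object(deps, "lower")
print(agent, obj)
```

Handling the Chinese-specific labels mentioned above would mean adding further label checks; which labels to add is left open here.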
Chinese Sentiment Analysis
  Sentiment Lexicon:
    HowNet
    NTU Sentiment Dictionary (Ku and Chen, 2007)
    A sentiment lexicon from Tsinghua University (Li and Sun, 2007)
  Bilingual and Multilingual Chinese Sentiment Analysis Research
    Wan, 2008; Wan, 2009; Boyd-Graber and Resnik, 2010; Lu et al., 2011; etc.
  Chinese Sentiment Analysis Tools
    LingPipe http://alias-i.com/lingpipe/
    Semantria https://semantria.com/
Outline
  Motivation
  Implicature in Chinese
  Inference in Chinese
  Extracting Chinese GoodFor/BadFor
  Conclusions
Conclusions
  The implicatures that arise from explicit sentiment toward goodFor/badFor events exist in Chinese, and they are similar to those in English.
  The inference rules we developed for English apply to Chinese.
  There are several cases where the inferences are blocked, and such cases exist in both Chinese and English.
  It is promising to develop systems that automatically extract Chinese goodFor/badFor events using the existing methods for English and leveraging the parallel corpus.
Questions?
  Thanks to Fan Zhang and Changsheng Liu for annotations.
Selected References:
Jordan Boyd-Graber and Philip Resnik. 2010. Holistic sentiment analysis across languages: Multilingual supervised latent Dirichlet allocation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.
Lingjia Deng and Janyce Wiebe. 2014. Sentiment propagation via implicature constraints. In Meeting of the European Chapter of the Association for Computational Linguistics.
Lingjia Deng, Yoonjung Choi, and Janyce Wiebe. 2013. Benefactive/malefactive event and writer attitude annotation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics.
Lun-Wei Ku and Hsin-Hsi Chen. 2007. Mining opinions from the web: Beyond relevance retrieval. Journal of the American Society for Information Science and Technology.
Jun Li and Maosong Sun. 2007. Experimental study on sentiment classification of Chinese review using machine learning techniques. In Natural Language Processing and Knowledge Engineering, 2007.
Bin Lu, Chenhao Tan, Claire Cardie, and Benjamin K Tsou. 2011. Joint bilingual sentiment classification with unlabeled parallel corpora. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
Xiaojun Wan. 2008. Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing.
Xiaojun Wan. 2009. Co-training for cross-lingual sentiment classification. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP.
Theresa Wilson and Janyce Wiebe. 2003. Annotating opinions in the world press. In Proceedings of the 4th ACL SIGdial Workshop on Discourse and Dialogue.