An Investigation of Implicatures in Chinese Lingjia Deng, Janyce Wiebe Intelligent Systems Program...

An Investigation of Implicatures in Chinese

Lingjia Deng, Janyce Wiebe

Intelligent Systems Program
Department of Computer Science
University of Pittsburgh

Outline

Introduction

Implicature in Chinese

Inference in Chinese

Extracting Chinese GoodFor/BadFor
  Chinese GoodFor/BadFor Words
  Syntax of Chinese Agents/Objects
  Chinese Sentiment Analysis

Conclusions

Outline

Introduction




Conclusions

Introduction

Scenario:

The government proposes the Affordable Care Act bill. We want to analyze everyone’s opinion of it.

We can collect opinions through surveys, questionnaires, etc.

We can also collect writers’ stances by analyzing their posts online.

Introduction

“The bill will lower the skyrocketing healthcare costs.”

Explicit (Direct) Sentiment:
  writer negative toward the skyrocketing healthcare costs
    The healthcare cost is too high. I cannot afford it.

Implicit (Inferred) Sentiment:
  writer positive toward the bill will lower costs
    There is a chance that the costs could be decreased! I love it!
  writer positive toward the bill
    The bill is able to do this! I’ll vote for it!

WHAT ABOUT BILL?

GoodFor/BadFor Event

“The bill will lower the skyrocketing healthcare costs.”
<bill, lower, healthcare costs>

GoodFor/BadFor Event (Deng et al., ACL 2013 short):
  goodFor event: help, increase, etc.
  badFor event: lower, destroy, decrease, etc.
  <agent, goodFor/badFor event, object>

GoodFor/BadFor Corpus (Deng et al., ACL 2013 short):
  134 political editorials
  e.g. <bill, lower, healthcare costs>
  e.g. <positive, badFor, negative>
  almost 20% of sentences have clear goodFor/badFor events
  available at mpqa.cs.pitt.edu

Benefactive/Malefactive Event
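As an illustration only, the <agent, goodFor/badFor event, object> triple can be sketched as a small data structure. The class name GfbfTriple, the helper make_triple, and the tiny EXAMPLE_LEXICON (built from the example words listed above) are hypothetical and not part of the corpus or its tools.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class GfbfTriple:
    """One <agent, goodFor/badFor event, object> triple (hypothetical name)."""
    agent: str
    event: str
    polarity: str  # "goodFor" or "badFor"
    obj: str


# Toy lexicon containing only the example words from the slide (assumption:
# a real lexicon would be far larger and sense-aware).
EXAMPLE_LEXICON: Dict[str, str] = {
    "help": "goodFor",
    "increase": "goodFor",
    "lower": "badFor",
    "destroy": "badFor",
    "decrease": "badFor",
}


def make_triple(agent: str, event: str, obj: str,
                lexicon: Dict[str, str] = EXAMPLE_LEXICON) -> Optional[GfbfTriple]:
    """Build a triple if the event word is in the lexicon, else return None."""
    polarity = lexicon.get(event)
    if polarity is None:
        return None
    return GfbfTriple(agent=agent, event=event, polarity=polarity, obj=obj)


# The running example: <bill, lower, healthcare costs>
triple = make_triple("bill", "lower", "healthcare costs")
print(triple)
```

An event word outside the toy lexicon simply yields None, which mirrors the idea that only clear goodFor/badFor events are represented.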

Related Work

Words/phrases directly imply implicit opinions. (Zhang and Liu, 2011; Feng et al., 2013)

Infer an overall polarity of a sentence by compositional semantics. (Choi and Cardie, 2008; Moilanen et al., 2010)

Identify classes of goodFor/badFor terms, and carry out studies involving artificially constructed goodFor/badFor triples and corpus examples matching fixed linguistic templates. (Anand and Reschke 2010; 2011)

Generate a lexicon of patient polarity verbs, which correspond to goodFor/badFor events whose spans are verbs. (Goyal et al., 2012)

Investigate sarcasm where the writer holds a positive sentiment toward a negative situation. (Riloff et al., 2013)

Our Work on GoodFor/BadFor

An annotated goodFor/badFor corpus. (Deng et al., ACL 2013 short)

A sense-level goodFor/badFor lexicon. (Choi et al., WASSA 2014)

Four inference rule schemas and a graph-based model for sentiment propagation. (Deng and Wiebe, EACL 2014)

An optimization framework for joint sentiment inference and disambiguating goodFor/badFor components. (Deng et al., Coling 2014)

A rule-based framework for representing and analyzing opinion implicatures. (Wiebe and Deng, arXiv 2014; WASSA 2014)

TODAY


Motivation For This Work

This work is an investigation of implicatures in Chinese.

People speaking different languages may express their opinions in different ways.

Before directly applying the English goodFor/badFor implicature to Chinese, we want to investigate:
  whether such implicature also exists in Chinese;
  whether the sentiment inference rules also apply to Chinese implicit opinions;
  whether it is feasible to extract goodFor/badFor events and the corresponding components in Chinese.

Outline

Motivation

Implicature in Chinese
  Agreement Study



Conclusions

Implicature in Chinese: Agreement Study

An opinion-oriented, paragraph-parallel corpus: the Chinese version of the New York Times (http://cn.nytimes.com/).

Select the English paragraphs containing English goodFor/badFor words.

Present the parallel Chinese paragraphs.


Implicature in Chinese: Agreement Study

All three annotators, including me, are Chinese graduate students at the University of Pittsburgh.

Annotate 60 paragraphs, 253 sentences.

Conduct the agreement study in the same manner as (Deng et al., 2013).

Implicature in Chinese: Agreement Study

Train with the English manual (Deng et al., 2013) and several annotated Chinese examples.

Annotate:
  (A) spans of the goodFor/badFor events
  (B) spans of the agents and objects of the events
  (C) polarities of the events: goodFor or badFor
  (D) writer’s sentiments toward the agents and objects: positive, negative, neutral

Evaluate by the same metrics as (Deng et al., 2013):
  for (A) & (B): percentage of spans that both annotators annotate
  for (C) & (D): kappa
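For readers unfamiliar with the kappa statistic used for (C) and (D), a minimal sketch follows. It assumes the standard two-annotator Cohen's kappa; the slides only say "kappa", so the exact variant used in the study is an assumption, and the function name cohen_kappa and the sample labels are illustrative.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up polarity labels for four events, for illustration only.
anno_1 = ["goodFor", "goodFor", "badFor", "badFor"]
anno_2 = ["goodFor", "badFor", "badFor", "badFor"]
kappa = cohen_kappa(anno_1, anno_2)
print(kappa)
```

Kappa corrects raw agreement for chance, which is why it is preferred over simple percentage agreement for categorical labels such as polarity and sentiment.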

Implicature in Chinese: Agreement Study

overlap(a,b)   (A) goodFor/badFor span   (B) agent span   (B) object span
Anno 1&2       0.7929                    0.9091           0.9091
Anno 1&3       0.7044                    0.9524           1.0

kappa          (C) goodFor/badFor polarity   (D) sentiment toward agent   (D) sentiment toward object
Anno 1&2       0.9385                        0.7830                       0.7238
Anno 1&3       0.8966                        0.5913                       0.8478

All the scores are good: trained with the English manual, the annotators are able to detect similar implicatures in Chinese.

Scores of (A) and (D) are lower than those in the English goodFor/badFor agreement study (Deng et al., 2014).

Implicature in Chinese: Agreement Study

For annotating (D) writer’s sentiments, the main disagreement comes from cases where:
  Anno 1 annotated positive or negative
  Anno 2 annotated neutral

We conducted a phase-II agreement study on 10 editorials from the English corpus (Deng et al., 2013). Three scores:
  I. agreement scores in Chinese by the three annotators
  II. agreement scores in English by the three annotators
  III. previous agreement scores (Deng et al., 2013)

score I = score II; score I < score III; score II < score III

The annotators have a similar understanding of implicatures in the two languages.

Implicature in Chinese: Agreement Study

For annotating (A) goodFor/badFor events, the major disagreement comes from cases where:
  Anno 1 marks a goodFor/badFor span
  Anno 2 doesn’t mark it because he thinks it violates the syntax rules we specified in the English manual.

Syntax rules are specified in the English manual to guide the annotators to focus on clear cases of goodFor/badFor events, e.g.
  The object should be the major semantic object.
  The goodFor/badFor polarity should be perceived within the triple.

GoodFor/BadFor Cases Evoked by Chinese Syntax

The goodFor/badFor polarity should be perceived within the triple.

It will put the reform to die.

In English: this is NOT annotated as a goodFor/badFor event.
  put is the verb: <it, put, reform>
  put X to die: badFor X
  put X to revive: goodFor X

GoodFor/BadFor Cases Evoked by Chinese Syntax

The goodFor/badFor polarity should be perceived within the triple.

It will put the reform to die.

这将把改革置于死地。

In Chinese: this can be represented as a clear goodFor/badFor case.
  “put” is not a verb in the Chinese sentence
  BA structure (Chao, 1968; Li and Thompson, 1989; Sybesma, 1992): subject, BA, object, verb
  it will BA kill the reform

Implicature in Chinese: Conclusion

Such syntax is commonly seen in Chinese.

The goodFor/badFor events that arise from this Chinese syntax are clear enough in Chinese.
  It will kill the reform.

In order to fully study Chinese goodFor/badFor events, the manual should be revised to provide guidance on annotating such events.

Overall, similar implicatures can be perceived in English and in Chinese.

Outline

Motivation

Inference in Chinese
  Graph Model for Sentiment Propagation (Deng and Wiebe, 2014)


Conclusions

Graph Model (Deng and Wiebe, 2014)

[Figure: graph model. Nodes: agent/object. Edges: goodFor, badFor. Components: explicit sentiment detector; loopy belief propagation, encoding the inference rules.]
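To make the idea of propagating sentiment over goodFor/badFor edges concrete, here is a deliberately simplified sketch. It is not the loopy belief propagation of the graph model; it only assumes the deterministic reading suggested by the introduction example (negative toward the costs, badFor event, hence positive toward the bill): a goodFor edge carries the same sentiment to the other end, a badFor edge carries the opposite one. The function names are hypothetical.

```python
def flip(sentiment):
    """Return the opposite sentiment; anything else (e.g. neutral) is unchanged."""
    return {"positive": "negative", "negative": "positive"}.get(sentiment, sentiment)


def infer_other_end(known_sentiment, polarity):
    """Simplified rule: goodFor keeps the sentiment, badFor flips it."""
    if polarity == "goodFor":
        return known_sentiment
    if polarity == "badFor":
        return flip(known_sentiment)
    raise ValueError("polarity must be 'goodFor' or 'badFor'")


def propagate(edges, seeds):
    """Spread seed sentiments over (agent, polarity, object) edges.

    edges: list of (agent, polarity, object) tuples.
    seeds: dict mapping some nodes to 'positive' or 'negative', standing in
           for the output of an explicit sentiment detector.
    Nodes that already have a sentiment are never overwritten.
    """
    sentiments = dict(seeds)
    changed = True
    while changed:
        changed = False
        for agent, polarity, obj in edges:
            for known, unknown in ((agent, obj), (obj, agent)):
                if known in sentiments and unknown not in sentiments:
                    sentiments[unknown] = infer_other_end(sentiments[known], polarity)
                    changed = True
    return sentiments


# Running example: the writer is explicitly negative toward the costs.
result = propagate(
    edges=[("bill", "badFor", "healthcare costs")],
    seeds={"healthcare costs": "negative"},
)
print(result)
```

Unlike this hard-rule version, a probabilistic model can weigh conflicting evidence from several neighbors, which is presumably why belief propagation is used in the actual system.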

Inference in Chinese: Graph Model Performance

We run an isolated evaluation of the graph model itself (Deng and Wiebe, 2014).

For each node, calculate how often its sentiment is propagated correctly, given that a neighbor node is assigned the correct sentiment label.

Dataset               # subgraph   correctness
all subgraph          136          0.7058
multi-node subgraph   61           0.8251

The scores in Chinese are lower than those in English (89% in (Deng and Wiebe, 2014)).

Blocked Inference

Blocked Inference: In Chinese and English

… a misreading which estimated the law would “reduce the amount of labor …

<law, reduce, labor>, where reduce is BADFOR

The writer doesn’t believe <law, reduce, labor>; the “misreading” does.

The writer is negative toward the “misreading”.

For events that the writer does not believe to be true, the inference should be blocked: the event is not in the writer’s belief space (Wiebe and Deng, 2014).
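One way to picture blocking is as a filter applied before any inference rule fires. The sketch below assumes each event carries a flag saying whether it lies in the writer's belief space; the Event type, the flag name in_writer_belief_space, and the function events_open_to_inference are hypothetical, and deciding the flag automatically is a separate problem not addressed here.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """A goodFor/badFor event plus a belief flag (hypothetical representation)."""
    agent: str
    polarity: str  # "goodFor" or "badFor"
    obj: str
    in_writer_belief_space: bool


def events_open_to_inference(events: List[Event]) -> List[Event]:
    """Keep only events the writer believes; inference on the rest is blocked."""
    return [event for event in events if event.in_writer_belief_space]


events = [
    # The writer asserts this event, so sentiment may be inferred from it.
    Event("bill", "badFor", "healthcare costs", True),
    # Attributed to the "misreading", not believed by the writer: blocked.
    Event("law", "badFor", "labor", False),
]
usable = events_open_to_inference(events)
print(usable)
```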

Inference in Chinese: Conclusion

Though there are cases where the inference rules are blocked,
  such cases appear both in Chinese and in English;
  we didn’t find evidence showing that blocked inference only occurs in English.

Apart from the blocked inferences, the good correctness scores provide evidence that the inference rules also apply to Chinese.

Outline

Motivation

Extracting Chinese GoodFor/BadFor
  Chinese GoodFor/BadFor Words
  Syntax of Chinese Agents/Objects
  Chinese Sentiment Analysis

English + Parallel Corpus?

Conclusions

Chinese GoodFor/BadFor Words

Given that we have an English goodFor/badFor lexicon (Choi et al., 2014), is it feasible to derive a bilingual goodFor/badFor lexicon from a parallel corpus?

We manually find the parallel spans in English corresponding to the annotated goodFor/badFor spans in Chinese.

76.25% of the annotated Chinese goodFor/badFor spans have parallel goodFor/badFor spans in English.

For the other annotated Chinese goodFor/badFor spans, there is no corresponding goodFor/badFor span in English, due to:
  Chinese syntax;
  paraphrasing.
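The projection idea can be sketched as follows, assuming span alignments are already available (in the study they were found manually). The function project_lexicon, the toy English lexicon, and the aligned pairs are illustrative assumptions, not data from the study; the unaligned entry reuses the BA-structure example, where the English side has no goodFor/badFor span.

```python
def project_lexicon(aligned_spans, english_lexicon):
    """Project English goodFor/badFor polarity onto aligned Chinese spans.

    aligned_spans: list of (chinese_span, english_span_or_None) pairs.
    english_lexicon: dict mapping an English word to 'goodFor' or 'badFor'.
    Returns (bilingual_lexicon, unmatched_chinese_spans, coverage).
    """
    bilingual = {}
    unmatched = []
    matched = 0
    for chinese, english in aligned_spans:
        polarity = english_lexicon.get(english) if english is not None else None
        if polarity is None:
            # No parallel goodFor/badFor span, e.g. due to syntax or paraphrasing.
            unmatched.append(chinese)
        else:
            bilingual[chinese] = polarity
            matched += 1
    coverage = matched / len(aligned_spans) if aligned_spans else 0.0
    return bilingual, unmatched, coverage


# Toy inputs for illustration only.
english_lexicon = {"lower": "badFor", "help": "goodFor", "increase": "goodFor"}
aligned = [
    ("降低", "lower"),
    ("帮助", "help"),
    ("增加", "increase"),
    ("置于死地", None),  # no goodFor/badFor span on the English side
]
bilingual, unmatched, coverage = project_lexicon(aligned, english_lexicon)
print(bilingual, unmatched, coverage)
```

The coverage value plays the same role as the 76.25% figure above: the share of Chinese spans for which polarity can be borrowed from English.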

Chinese Agent/Object

We use the Stanford dependency parser to extract the agent/object in English (Deng et al., 2014):
  nsubj-(event, agent)
  dobj-(event, object)

Can we use the same dependency labels to extract the agent/object in Chinese?

We choose the Chinese Stanford dependency parser.

Some dependency labels exist in both Chinese and English.
  There are more nsubj and dobj in the Chinese data than in the English data.

Some labels are designed specifically for Chinese (Chang et al., 2009).
  19.57% in agents, 25.82% in objects.
  They are similar to some labels in English.
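The nsubj/dobj heuristic can be sketched without calling any parser, assuming the parse is already available as (label, head, dependent) tuples. That input format, the function name extract_agent_object, and the hand-written parse below are assumptions for illustration; they are not the output format of the Stanford tools.

```python
def extract_agent_object(dependencies, event):
    """Find the agent (nsubj) and object (dobj) of an event word.

    dependencies: iterable of (label, head, dependent) tuples.
    event: the head word of the goodFor/badFor event.
    Returns (agent, obj); either may be None if no such dependency exists.
    """
    agent = None
    obj = None
    for label, head, dependent in dependencies:
        if head != event:
            continue
        if label == "nsubj" and agent is None:
            agent = dependent
        elif label == "dobj" and obj is None:
            obj = dependent
    return agent, obj


# Hand-written dependencies for
# "The bill will lower the skyrocketing healthcare costs."
parse = [
    ("det", "bill", "The"),
    ("nsubj", "lower", "bill"),
    ("aux", "lower", "will"),
    ("det", "costs", "the"),
    ("amod", "costs", "skyrocketing"),
    ("nn", "costs", "healthcare"),
    ("dobj", "lower", "costs"),
]
agent, obj = extract_agent_object(parse, "lower")
print(agent, obj)
```

Covering the Chinese-specific labels would mean extending the two label checks, but which labels to map to agent or object is left open here.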

Chinese Sentiment Analysis

Sentiment Lexicon:
  HowNet
  NTU Sentiment Dictionary (Ku and Chen, 2007)
  A sentiment lexicon from Tsinghua University (Li and Sun, 2007)

Bilingual and Multilingual Chinese Sentiment Analysis Research:
  Wan, 2008; Wan, 2009; Boyd-Graber and Resnik, 2010; Lu et al., 2011; etc.

Chinese Sentiment Analysis Tools:
  LingPipe http://alias-i.com/lingpipe/
  Semantria https://semantria.com/




Outline

Motivation



Extracting Chinese GoodFor/BadFor

Conclusions

Conclusions

The implicatures that arise from explicit sentiment toward goodFor/badFor events exist in Chinese, and they are similar to those in English.

The inference rules we developed for English apply to Chinese.

There are several cases where the inferences are blocked, and such cases exist in both Chinese and English.

It is promising to develop systems that automatically extract Chinese goodFor/badFor events using the existing methods for English and leveraging the parallel corpus.

Questions?

Thanks to Fan Zhang and Changsheng Liu for the annotations.

Selected references:

Jordan Boyd-Graber and Philip Resnik. 2010. Holistic sentiment analysis across languages: Multilingual supervised latent Dirichlet allocation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.

Lingjia Deng and Janyce Wiebe. 2014. Sentiment propagation via implicature constraints. In Meeting of the European Chapter of the Association for Computational Linguistics.

Lingjia Deng, Yoonjung Choi, and Janyce Wiebe. 2013. Benefactive/malefactive event and writer attitude annotation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics.

Lun-Wei Ku and Hsin-Hsi Chen. 2007. Mining opinions from the web: Beyond relevance retrieval. Journal of the American Society for Information Science and Technology.

Jun Li and Maosong Sun. 2007. Experimental study on sentiment classification of Chinese review using machine learning techniques. In Natural Language Processing and Knowledge Engineering, 2007.

Bin Lu, Chenhao Tan, Claire Cardie, and Benjamin K Tsou. 2011. Joint bilingual sentiment classification with unlabeled parallel corpora. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.

Xiaojun Wan. 2008. Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing.

Xiaojun Wan. 2009. Co-training for cross-lingual sentiment classification. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP.

Theresa Wilson and Janyce Wiebe. 2003. Annotating opinions in the world press. In Proceedings of the 4th ACL SIGdial Workshop on Discourse and Dialogue.
