An Overview of Opinionated Tasks and Corpus Preparation

Hsin-Hsi Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, Taiwan

http://research.nii.ac.jp/ntcir/ntcir-ws6/opinion/ntcir5-opinionws-en.html

What is an opinion?
An opinion is subjective information.
An opinion usually contains an opinion holder, an attitude, and a target, but none of these is obligatory.
A sentential clause, or a meaningful unit in Chinese, is the smallest unit of an opinion.

Why is opinion processing important?
The amount of information on the Internet is exploding, and it is hard for humans to extract opinions from it manually.
Public opinion is an important indicator for companies and governments.
Opinions change over time, so tracking them automatically is an important problem.

Fact-based vs. Opinion-based
Examples:
Circular vs. happy
"He is an engineer." vs. "He thinks that his boss is a kind person."
"Why is the sky blue?" vs. "Do people support the government?"

Previous Work (1)
English:
  Sentiment words (Wiebe et al.; Kim and Hovy; Takamura et al.)
  Opinion sentence extraction (Riloff and Wiebe; Kim and Hovy)
  Opinion document extraction (Wiebe et al.; Pang et al.)
  Opinion summarization: reviews and products (Hu and Liu; Dave et al.)

Previous Work (2)
Japanese:
  Opinion extraction (Kobayashi et al.: reviews, at word/sentence level)
  Opinion summarization (Morinaga et al.: product reputations; Seki, Eguchi, and Kando)
Chinese:
  Opinion extraction (Ku, Wu, Li, and Chen)
  Opinion summarization (Ku, Li, Wu, and Chen)
  News and blog corpora (Ku, Liang, and Chen)
Korean?

Corpus Preparation (1)
Quantity
  How much material should we collect?
  Words/Sentences/Documents
Source
  Which sources should we pick? Should we mine opinions from general documents or from obviously opinionated documents (e.g., discussion groups)?
  News, reviews, blogs, ...

Corpus Preparation (2)
Different granularity
  Word level
  Sentence level
  Clause level
  Document level
  Multi-document level (summarization)
Different sources
Different languages

Previous Work (Corpus Preparation 1/5)
Example: NRRC Summer Workshop on Multiple-Perspective QA
People involved: 1 researcher, 3 graduate students, 6 professors
Collected 270,000 documents over an 11-month period; retrieved documents relevant to 8 topics, with more than 200 documents for each topic
Workshop: MPQA (Multi-Perspective Question Answering)
RRC host: Northeast Regional Research Center (NRRC), 2002
Leader: Prof. Janyce Wiebe
Participants: Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson

Previous Work (Corpus Preparation 2/5)
Source: news documents (World News Connection, WNC)

In another study at the word level: 2,615 words

Previous Work (Corpus Preparation 3/5)
Example: Using the NTCIR corpus (Chinese)
Reusable
NTCIR2, news documents
Retrieved documents relevant to 6 topics
On average, 34 documents for each topic
At the word level: 838 words
Experiments using NTCIR3 are ongoing

Previous Work (Corpus Preparation 4/5)

Previous Work (Corpus Preparation 5/5)
Example: Using reviews from the Web (Japanese)
Specific domains: cars and games
15,000 reviews (230,000 sentences) for cars; 9,700 reviews (90,000 sentences) for games
Topic words are used (e.g., names of car and game companies)
Semi-automatic, pattern-based methods for collecting opinion terms

Corpus Annotation
Annotation types (1)
  Support/Non-support
  Sentiment/Non-sentiment
  Positive/Neutral/Negative
  Strong/Medium/Weak
Annotation types (2)
  Opinion holder/Attitude/Target
  Nested opinions

Previous Work (Corpus Annotation 1/4)
Example: NRRC Summer Workshop on Multiple-Perspective QA (English)
114 documents annotated in total
57 with deep annotations, 57 with shallow annotations
7 annotators

Previous Work (Corpus Annotation 2/4)
Tags
  Opinion: on=implicit/formally declared
  Fact: onlyfactive=yes/no
  Subjectivity: strength=high/medium/low
  Attitude: neg-attitude/pos-attitude
  Writer: opinion holder information

Previous Work (Corpus Annotation 3/4)
Example: Using the NTCIR corpus (Chinese)
204 documents annotated in total
3 annotators
XML-style tags
Opinion types are annotated, but strength is not (because of the agreement issue)

Previous Work (Corpus Annotation 4/4)

Corpus Evaluation (1)
How to choose materials?
  Filter out candidates whose annotations diverge too much among annotators? (Agreement?)
  How many annotators are needed for one candidate? (More annotators, lower agreement)
How to build the gold standard?
  Voting
  Use instances with consistent annotations
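
The two gold-standard options named on this slide (voting, and keeping only consistently annotated instances) can be sketched as follows. This is a minimal illustration, not the procedure used for any of the corpora above: the function name `build_gold_standard`, the strict-majority rule, and the sample labels are all assumptions made for the example.

```python
from collections import Counter


def build_gold_standard(annotations, unanimous=False):
    """Build a gold standard from per-item annotator labels.

    annotations: dict mapping an item id to the list of labels it received.
    unanimous=False keeps items whose top label has a strict majority (voting).
    unanimous=True keeps only items on which all annotators agree.
    """
    gold = {}
    for item_id, labels in annotations.items():
        label, votes = Counter(labels).most_common(1)[0]
        needed = len(labels) if unanimous else len(labels) // 2 + 1
        if votes >= needed:
            gold[item_id] = label
    return gold


# Hypothetical annotations from three annotators.
annotations = {
    "s1": ["positive", "positive", "negative"],
    "s2": ["neutral", "neutral", "neutral"],
    "s3": ["positive", "negative", "neutral"],
}

voted = build_gold_standard(annotations)
consistent = build_gold_standard(annotations, unanimous=True)
print(voted)
print(consistent)
```

Voting keeps more instances but admits contested labels; the unanimous setting gives a smaller, cleaner gold standard, which is the trade-off the slide raises.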

Corpus Evaluation (2)
How to evaluate a corpus for a subjective task?
  Agreement (Is it enough?)
  Kappa value (To what agreement level?)
    Almost perfect agreement
    Substantial agreement
    Moderate agreement
    Fair agreement
    Slight agreement
    Less than chance agreement
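
The agreement labels listed on this slide match the scale commonly attributed to Landis and Koch (1977). The slide gives no numeric cut-offs, so the thresholds below are taken from that scale and should be read as an assumption about what the slide intends; `interpret_kappa` is a hypothetical helper name.

```python
def interpret_kappa(kappa):
    """Map a kappa value to an agreement label (Landis and Koch style bands)."""
    if kappa < 0:
        return "less than chance agreement"
    if kappa <= 0.20:
        return "slight agreement"
    if kappa <= 0.40:
        return "fair agreement"
    if kappa <= 0.60:
        return "moderate agreement"
    if kappa <= 0.80:
        return "substantial agreement"
    return "almost perfect agreement"


for value in (-0.10, 0.32, 0.54, 0.68, 0.90):
    print(value, interpret_kappa(value))
```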

Kappa coefficient (wiki)
Cohen's kappa coefficient is a statistical measure of inter-rater agreement.
It is generally thought to be a more robust measure than a simple percent-agreement calculation, since it takes into account the agreement occurring by chance.
Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories.
The first mention of a kappa-like statistic in print is attributed to Galton (1892).

Kappa coefficient (wiki)
The equation for κ is:

$$\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)}$$

Pr(a) is the relative observed agreement among raters.
Pr(e) is the hypothetical probability of chance agreement.
If the raters are in complete agreement, then κ = 1.
If there is no agreement among the raters (other than what would be expected by chance), then κ ≤ 0.
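
As a minimal sketch of the definition above, the function below computes kappa from two raters' label sequences. Pr(a) is the fraction of items on which the raters agree, and Pr(e) is the sum over categories of the product of the two raters' marginal proportions. The function name `cohen_kappa` and the sample labels are assumptions made for the example.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same N items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Pr(a): relative observed agreement.
    pr_a = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Pr(e): chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    pr_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if pr_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (pr_a - pr_e) / (1 - pr_e)


# Hypothetical polarity labels from two annotators.
rater_a = ["pos", "pos", "neg", "neg", "pos", "neg"]
rater_b = ["pos", "neg", "neg", "neg", "pos", "pos"]
print(cohen_kappa(rater_a, rater_b))
```

Here the raters agree on 4 of 6 items (Pr(a) = 0.667) while chance agreement is 0.5, so kappa is about 0.33, well below the raw agreement rate.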

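Kappa coefficient
Two raters are asked to classify objects into categories 1 and 2. The table below contains cell probabilities for a 2 by 2 table.

|                     | Rater 2: category 1 | Rater 2: category 2 | Total |
| Rater 1: category 1 | $P_{11}$ | $P_{12}$ | $P_{1\cdot}$ |
| Rater 1: category 2 | $P_{21}$ | $P_{22}$ | $P_{2\cdot}$ |
| Total               | $P_{\cdot 1}$ | $P_{\cdot 2}$ | 1 |

$P_0 = P_{11} + P_{22}$ is the observed level of agreement.
This value needs to be compared to the value that you would expect if the two raters were totally independent:
$P_e = P_{1\cdot}P_{\cdot 1} + P_{2\cdot}P_{\cdot 2}$
http://www.childrensmercy.org/stats/definitions/kappa.htm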

Example
Hypothetical example: 29 patients are examined by two independent doctors (see Table). 'Yes' denotes that a doctor diagnoses the patient with disease X. 'No' denotes that a doctor classifies the patient as not having disease X.

$P_0 = P_{11} + P_{22} = (10 + 12)/29 = 0.76$
$P_e = P_{1\cdot}P_{\cdot 1} + P_{2\cdot}P_{\cdot 2} = 0.586 \times 0.345 + 0.414 \times 0.655 = 0.473$
Kappa = (0.76 - 0.473)/(1 - 0.473) = 0.54
http://www.dmi.columbia.edu/homepages/chuangj/kappa/
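
The worked example can be checked with a short script. The table itself did not survive in this transcript, so the off-diagonal counts below (7 and 0) are inferred from the marginal proportions quoted in the calculation (17/29 = 0.586, 10/29 = 0.345, 12/29 = 0.414, 19/29 = 0.655). Which doctor corresponds to rows and which to columns is an assumption, and `kappa_from_table` is a hypothetical helper name.

```python
def kappa_from_table(table):
    """Return (P0, Pe, kappa) for a square table of agreement counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: the diagonal cells.
    p0 = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for rows (rater 1) and columns (rater 2).
    row_share = [sum(row) / n for row in table]
    col_share = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected if the two raters were independent.
    pe = sum(r * c for r, c in zip(row_share, col_share))
    return p0, pe, (p0 - pe) / (1 - pe)


# Rows: first doctor (Yes, No); columns: second doctor (Yes, No).
# Diagonal counts 10 and 12 are from the slide; 7 and 0 are inferred.
counts = [[10, 7],
          [0, 12]]

p0, pe, kappa = kappa_from_table(counts)
print(round(p0, 2), round(pe, 3), round(kappa, 2))
```

Although the doctors agree on 76% of the patients, almost half of that agreement would be expected by chance, which is why kappa is only 0.54.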

Online Kappa Calculator
http://justus.randolph.name/kappa

Previous Work (Corpus Evaluation)
Different languages and annotation schemes may yield different agreement levels.
  Kappa: 0.32-0.65 (factivity only, English)
  Kappa: 0.40-0.68 (word level, Chinese)
Annotators with different backgrounds may reach different agreement levels.

What is needed for this work?
What kind of documents? News? Others?
All relevant documents?
Provide only the type of documents, or fully annotated documents for training?
Provide some sentiment words as clues?
To what granularity? Word, clause, sentence, document, or multi-document?
In which language? Monolingual, multilingual, or cross-lingual?

Natural Language Processing Lecture 15

Opinionated Applications
Hsin-Hsi Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, Taiwan

Opinionated Applications
Opinion extraction
  Sentiment word mining
  Opinionated sentence extraction
  Opinionated document extraction
Opinion summarization
Opinion tracking
Opinionated question answering
Multilingual/cross-lingual opinionated issues

Opinion Mining
Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarity.
Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and the non-supportive evidence.
Opinion tracking captures subjective information from various genres and monitors the development of opinions along spatial and temporal dimensions.

Opinion extraction
Opinion extraction finds opinion evidence in words, sentences, and documents, and then determines its polarity.
The composition of semantics and the composition of opinions are very much alike in documents:
Word -> Sentence -> Document
The algorithm is designed around this composition across granularities.
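
The Word -> Sentence -> Document composition can be illustrated with a small sketch. The actual scoring formulas are not recoverable from this transcript, so the version below simply sums scores at each level; the summation rule, the function names, and the tiny English lexicon are assumptions made for the example.

```python
def sentence_score(words, lexicon):
    """Sentence score: sum of the sentiment scores of its words."""
    return sum(lexicon.get(word, 0.0) for word in words)


def document_score(sentences, lexicon):
    """Document score: sum of the scores of its sentences."""
    return sum(sentence_score(words, lexicon) for words in sentences)


def polarity(score):
    """Positive score -> positive, negative score -> negative, zero -> neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical word-level sentiment scores.
lexicon = {"excellent": 1.0, "good": 0.8, "bad": -0.7, "terrible": -1.0}

document = [
    ["the", "service", "was", "excellent"],
    ["the", "food", "was", "bad"],
    ["a", "good", "experience"],
]

for words in document:
    print(words, polarity(sentence_score(words, lexicon)))
print("document:", polarity(document_score(document, lexicon)))
```

A fuller system would also handle negation and opinion holders at the sentence level; this sketch only shows how polarity propagates upward through the granularities.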

Seeds
Sentiment words in the General Inquirer (GI) and the Chinese Network Sentiment Dictionary (CNSD) are collected as seeds.
GI is in English, while CNSD is in Chinese; GI is translated into Chinese.
A total of 10,542 qualified seeds are collected in NTUSD (National Taiwan University Sentiment Dictionary).

Statistics of Seeds

Thesaurus Expansion
The seed vocabulary is enlarged with the Academia Sinica Bilingual Ontological WordNet.
Words in the same cluster may not always have the same opinion tendency, e.g., "forgive" vs. "appease".
How can words with different polarities within the same cluster/synset be distinguished?
By the opinion tendency of a word and its strength.

Sentiment Tendency of a Character (raw score)

Sentiment Tendency of a Character (normalization)

Sentiment Tendency of a Word

The sentiment degree of a Chinese word w is the average of the sentiment scores of its composing characters c1, c2, ..., cp.
A positive score denotes a positive word.
A negative score denotes a negative word.
A score of zero denotes a non-sentiment or neutral word.
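
The word-level rule above (average of character scores) can be sketched as follows. The character-level formulas were on slides whose content is missing from this transcript, so `character_scores` is only one plausible formulation: it assumes a character's score is the difference between its normalized frequencies in positive and negative seed words, scaled to the range [-1, 1]. The seed words and function names are hypothetical.

```python
from collections import Counter


def character_scores(positive_words, negative_words):
    """Assumed character scoring: normalized positive vs. negative frequency."""
    pos_counts = Counter(ch for word in positive_words for ch in word)
    neg_counts = Counter(ch for word in negative_words for ch in word)
    total_pos = sum(pos_counts.values())
    total_neg = sum(neg_counts.values())
    scores = {}
    for ch in set(pos_counts) | set(neg_counts):
        # Normalizing by the totals corrects for unequal seed list sizes.
        p = pos_counts[ch] / total_pos if total_pos else 0.0
        n = neg_counts[ch] / total_neg if total_neg else 0.0
        scores[ch] = (p - n) / (p + n)
    return scores


def word_score(word, scores):
    """Word score: average of its characters' scores (unseen characters count 0)."""
    if not word:
        return 0.0
    return sum(scores.get(ch, 0.0) for ch in word) / len(word)


# Hypothetical seed words.
scores = character_scores(["快樂", "歡樂"], ["悲傷", "悲痛"])
print(word_score("快樂", scores))   # composed of positive characters
print(word_score("悲傷", scores))   # composed of negative characters
print(word_score("悲歡", scores))   # one negative and one positive character
```

Because scoring works on characters, a word absent from the seed list can still receive a score as long as its characters appear in some seed word.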

Opinion Extraction at Sentence Level

Opinion Extraction at Document Level

Evaluation Corpus Preparation
Source: TREC (English; news) / NTCIR (Chinese; news) / Blog (Chinese; casual writing)
The corpus is prepared for multi-genre and multilingual issues.
The corpus is prepared to evaluate opinion extraction, summarization, and tracking.

Opinion Summarization
Find the important topics of a document set.
Find the sentences relevant to the important topics.
Find the opinions embedded in those sentences.
Summarize the opinions on the important topics.

Opinion Tracking
Opinion tracking is a kind of graph-based opinion summarization.
We are concerned with how opinions change over time. An opinion tracking system shows how people change their opinions as time goes by.
To track opinions, opinion extraction and summarization are necessary. Opinion extraction reveals the changes in opinion polarity, while opinion summarization identifies the correlated events.
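
A minimal sketch of the tracking idea: once opinion extraction has assigned a polarity score to each sentence, the scores can be grouped by time period to show how the balance of opinion shifts. The input format (period, score), the function name `track_opinions`, and the sample data are assumptions made for the example, not the system described in the slides.

```python
from collections import defaultdict


def track_opinions(scored_sentences):
    """Count positive and negative opinion sentences per time period.

    scored_sentences: iterable of (period, score) pairs, where score > 0 is
    positive, score < 0 is negative, and 0 is neutral (not counted).
    """
    timeline = defaultdict(lambda: {"positive": 0, "negative": 0})
    for period, score in scored_sentences:
        counts = timeline[period]
        if score > 0:
            counts["positive"] += 1
        elif score < 0:
            counts["negative"] += 1
    # Return periods in chronological (sorted) order.
    return {period: timeline[period] for period in sorted(timeline)}


# Hypothetical sentence scores tagged with the month of publication.
scored = [
    ("2006-01", 0.6),
    ("2006-01", -0.2),
    ("2006-02", -0.5),
    ("2006-02", -0.9),
    ("2006-01", 0.3),
]

timeline = track_opinions(scored)
for period, counts in timeline.items():
    print(period, counts)
```

Plotting these counts per period gives the kind of graph the slide refers to; linking the turning points to events is the job of opinion summarization.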

NTCIR: NII Test Collection for IR Systems project
NRRC 2002 workshop on MPQA (Multi-Perspective Question Answering), led by Prof. Janyce Wiebe

https://rrc.mitre.org/pubs.shtml
