Sentiment Analysis + extras (transcript of slides from courses.cs.ut.ee)
Sentiment Analysis + extrasNatural Language Processing: Lecture 10
09.11.2017
Kairit Sirts
Homework 2 – POS tagging
Score Name Features
97.3 Faiz word +/-2 words; prefixes, suffixes; numeric, cap, all caps, first and last word, hyphen, lower
96.6 Maksym word and stem +/-1 stems; prefixes, suffixes, last 2 words; cap, P and N punct; word pos, sent len
96.5 Olha word/stem -2 words/stems, +1 word/stem; prefixes, suffixes
95.4 Ian word and stem -2 words/stems; previous POS tag
94.5 Andre stem +1 word -2 words; prefixes, suffixes; cap, punct
94.1 Alina word -1 word, stem -2 stems; prefixes, suffixes; contains num, punct; cap, is punct, is num
93.9 Liza word; prefixes, suffixes; num, first, last, 2nd last, alpha, cap, all caps, punct, article (a, an, the)
93.4 Viacheslav stem; prefixes, suffixes; characters (by frequency)
93.2 Aytaj word -2 words; prefixes, suffixes; cap, all caps, lower, contains .
92.9 Vladyslav word/stem -2 words/stems; prefixes, suffixes; cap, punct; word length
89.5 Artem stem
89.2 Yevhen, Ivan, Hendrik stem
79.4 Yurii word, stem; bigram of word and previous word 2
Topics
• Sentiment analysis
• Sentiment lexicons
• Extras
  • Topic modeling
  • More about evaluation
    • Cross-validation
    • Precision and recall in the multiclass setting
• Slides partially based on:
  • https://web.stanford.edu/~jurafsky/slp3/slides/7_NB.pdf
  • https://lct-master.org/files/MullenSentimentCourseSlides.pdf
  • https://web.stanford.edu/~jurafsky/slp3/slides/21_SentLex.pdf
3
Sentiment Analysis
4
Positive or negative movie review
• unbelievably disappointing
• Full of zany characters and richly applied satire, and some great plot twists
• this is the greatest screwball comedy ever filmed
• It was pathetic. The worst part about it was the boxing scenes.
5
What is sentiment?
• Sentiment = feelings
  • Attitudes
• Emotions
• Opinions
• Subjective impressions, not facts
6
What is sentiment?
• Generally, a binary opposition in opinions is assumed
• For/against, like/dislike, good/bad etc
• Some sentiment analysis jargon:
  • “Semantic orientation”
• “Polarity”
7
What is sentiment analysis?
• Using NLP/statistics/machine learning methods to extract, identify or otherwise characterize the sentiment content of a text unit
• Also referred to as:
  • Opinion extraction
• Opinion mining
• Sentiment mining
• Subjectivity analysis
8
Why sentiment analysis?
• Movie: is this review positive or negative?
• Products: what do people think about the new iPhone?
• Public sentiment: how is consumer confidence? Is despair increasing?
• Politics: what do people think about this candidate or issue?
• Prediction: predict election outcomes or market trends from sentiment
• Customer service: is this customer email satisfied or dissatisfied?
• Marketing: how are people responding to this ad/campaign/product release/news item?
9
Scherer Typology of Affective States
• Emotion: brief organically synchronized … evaluation of a major event
• angry, sad, joyful, fearful, ashamed, proud, elated
• Mood: diffuse non-caused low-intensity long-duration change in subjective feeling
• cheerful, gloomy, irritable, listless, depressed, buoyant
• Interpersonal stances: affective stance toward another person in a specific interaction
• friendly, flirtatious, distant, cold, warm, supportive, contemptuous
• Attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons
• liking, loving, hating, valuing, desiring
• Personality traits: stable personality dispositions and typical behavior tendencies
• nervous, anxious, reckless, morose, hostile, jealous
10
What is sentiment analysis?
• Sentiment analysis is the detection of attitudes: “enduring, affectively colored beliefs, dispositions towards objects or persons”
1. Holder (source) of attitude
2. Target (aspect) of attitude
3. Type of attitude
  • From a set of types
    • Like, love, hate, value, desire, etc.
  • Or (more commonly) simple weighted polarity:
    • positive, negative, neutral, together with strength
4. Text containing the attitude
  • Sentence or entire document
12
Sentiment analysis
• Simplest task:
  • Is the attitude of this text positive or negative?
• More complex:
  • Rank the attitude of this text from 1 to 5
• Advanced:
  • Detect the target, source, or complex attitude types
13
Other related tasks
• Information extraction (discarding subjective information)
• Question answering (recognising opinion-oriented questions)
• Summarisation (accounting for multiple viewpoints)
• “Flame” detection
• Identifying child-suitability of videos based on comments
• Bias identification in news sources, fake news
• Identifying (in)appropriate content for ad placement
14
Applications in Business Intelligence
• Question: “Why aren’t consumers buying our laptop?”
• We know the concrete data: price, specs, competition etc
• We want to know subjective data:
  • “the design is tacky”
• “customer service was condescending”
• Misperceptions are also important, e.g. “updated drivers aren’t available” (even though they really are)
15
Challenges in Sentiment Analysis
• People express opinions in complex ways
• In opinion texts, lexical content alone can be misleading
• Negations and topic changes are common, at both the text and the sentence level
• Rhetorical devices such as sarcasm, irony, implication etc
16
A letter to a hardware store
“Dear <hardware store>
Yesterday I had occasion to visit <your competitor>. They had an excellent selection, friendly and helpful salespeople, and the lowest prices in town.
You guys suck.
Sincerely,”
17
Amazon.com 5 star
“The characters are so real and handled so carefully, that being trapped inside the Overlook is no longer just a freaky experience. You run along with them, filled with dread, from all the horrible personifications of evil inside the hotel's awful walls. There were several times where I actually dropped the book and was too scared to pick it back up. Intellectually, you know it's not real. It's just a bunch of letters and words grouped together on pages. Still, whenever I go into the bathroom late at night, I have to pull back the shower curtain just to make sure.”
18
Amazon.com 1 star
“The original Star Wars trilogy was a defining part of my childhood. Born as I was in 1971, I was just the right age to fall headlong into this amazing new world Lucas created. I was one of those kids that showed up early at toy stores [...] anxiously awaiting each subsequent installment of the series. I'm so glad that by my late 20s, the old thrill had faded, or else I would have been EXTREMELY upset over Episode I: The Phantom Menace... perhaps the biggest let-down in film history.”
19
Pitchfork.com (0.0 out of 10)
“Ten years on from Exile, Liz has finally managed to achieve what seems to have been her goal ever since the possibility of commercial success first presented itself to her: to release an album that could have just as easily been made by anybody else.”
20
Amazon.com 1 star
“It took a couple of goes to get into it, but once the story hooked me, I found it difficult to put the book down -- except for those moments when I had to stop and shriek at my friends, "SPARKLY VAMPIRES!" or "VAMPIRE BASEBALL!" or "WHY IS BELLA SO STUPID?" These moments came increasingly often as I reached the climactic chapters, until I simply reached the point where I had to stop and flail around laughing.”
21
What to classify
• There are many possibilities of what we might want to classify
  • Users
• Texts
• Sentences / paragraphs / chunks of texts
• Predetermined descriptive phrases
  • <ADJ NOUN>, <NOUN NOUN>, <ADV ADJ>, etc
• Words
• Tweets
22
SA as document classification
• Extract features from text
  • Ngrams
• POS tags, parse structures
• Negation words, emoticons, exclamation marks etc
• Distributed features
• Sentiment lexicons
• Build a classifier
  • Naïve Bayes
• Logistic regression
• SVM
23
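The pipeline above can be sketched end to end. Below is a minimal pure-Python Naïve Bayes over unigram count features with add-one smoothing; the four training documents and their labels are invented for illustration, and real systems would add the richer features listed on the slide (negation, emoticons, lexicons).

```python
# Sketch of the feature-extraction + classifier pipeline: unigram
# counts fed into a multinomial Naive Bayes with add-one smoothing.
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (token_list, label). Returns log-priors, log-likelihoods, vocab."""
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    log_prior = {l: math.log(c / len(docs)) for l, c in label_counts.items()}
    log_lik = {}
    for l in label_counts:
        total = sum(word_counts[l].values()) + len(vocab)  # add-one smoothing
        log_lik[l] = {w: math.log((word_counts[l][w] + 1) / total) for w in vocab}
    return log_prior, log_lik, vocab

def predict_nb(tokens, log_prior, log_lik, vocab):
    """Score each class; unknown words are simply skipped."""
    scores = {}
    for l in log_prior:
        scores[l] = log_prior[l] + sum(log_lik[l][w] for w in tokens if w in vocab)
    return max(scores, key=scores.get)

train = [("great fun film".split(), "pos"),
         ("boring and pathetic".split(), "neg"),
         ("richly applied satire great plot".split(), "pos"),
         ("the worst film".split(), "neg")]
lp, ll, v = train_nb(train)
print(predict_nb("great satire".split(), lp, ll, v))  # → pos
```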
Sentiment Lexicons
25
Polarity keywords
• There seems to be some relation between positive words and positive reviews
• Can we come up with a set of keywords by hand to identify polarity?
• Results from (Pang et al., 2002)
26
                Proposed word list                                                 Accuracy  Ties
Human 1         Pos: dazzling, brilliant, phenomenal, excellent, fantastic         58%       75%
                Neg: suck, terrible, awful, unwatchable, hideous
Human 2         Pos: gripping, mesmerizing, riveting, spectacular, cool,           64%       39%
                awesome, thrilling, badass, excellent, moving, exciting
                Neg: bad, cliched, sucks, boring, stupid, slow
Human 3 + stats Pos: love, wonderful, best, great, superb, still, beautiful        69%       16%
                Neg: bad, worst, stupid, waste, boring, ?, !
Sentiment/Affective Lexicons
• The General Inquirer
• LIWC - Linguistic Inquiry and Word Count
• MPQA Subjectivity Cues Lexicon
• Bing Liu Opinion Lexicon
• SentiWordNet
27
The General Inquirer
P. J. Stone et al., 1966. The General Inquirer: A Computer Approach to Content Analysis.
• Home page: http://www.wjh.harvard.edu/~inquirer
• List of Categories: http://www.wjh.harvard.edu/~inquirer/homecat.htm
• Spreadsheet: http://www.wjh.harvard.edu/~inquirer/inquirerbasic.xls
• Categories:
  • Positiv (1915 words) and Negativ (2291 words)
• Strong vs Weak, Active vs Passive, Overstated versus Understated
• Pleasure, Pain, Virtue, Vice, Motivation, Cognitive Orientation, etc
• Free for Research Use
28
LIWC – Linguistic Inquiry and Word Count
Pennebaker et al., 2007. Linguistic Inquiry and Word Count: LIWC
• Home page: http://www.liwc.net/
• 2300 words, >70 classes
• Affective Processes
  • negative emotion (bad, weird, hate, problem, tough)
  • positive emotion (love, nice, sweet)
• Cognitive Processes
  • Tentative (maybe, perhaps, guess), Inhibition (block, constraint)
• Pronouns, Negation (no, never), Quantifiers (few, many)
• Academic version $90, 30-day rental $10
29
MPQA Subjectivity Cues Lexicon
T. Wilson et al., 2005. Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis.
Riloff and Wiebe, 2003. Learning extraction patterns for subjective expressions.
• Home page: http://mpqa.cs.pitt.edu/lexicons/subj_lexicon/
• 6885 words
  • 2718 positive
• 4912 negative
• Each word annotated for intensity (strong, weak)
• GNU GPL
30
Bing Liu Opinion Lexicon
M. Hu and B. Liu, 2004. Mining and Summarizing Customer Reviews.
• Bing Liu's Page on Opinion Mining
• http://www.cs.uic.edu/~liub/FBS/opinion-lexicon-English.rar
• 6786 words
  • 2006 positive
• 4783 negative
31
SentiWordNet
S. Baccianella et al., 2010. SENTIWORDNET 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining.
• Home page: http://sentiwordnet.isti.cnr.it/
• All WordNet synsets automatically annotated for degrees of positivity, negativity, and neutrality/objectiveness
• [estimable(J,3)] “may be computed or estimated” • Pos 0 Neg 0 Obj 1
• [estimable(J,1)] “deserving of respect or high regard” • Pos .75 Neg 0 Obj .25
32
Semi-supervised Sentiment Lexicon Extraction
• Use a small amount of information
  • A few labeled examples
• A few hand-built patterns
• Then bootstrap a lexicon
33
Hatzivassiloglou and McKeown intuition for identifying word polarity
V. Hatzivassiloglou and K. R. McKeown, 1997. Predicting the Semantic Orientation of Adjectives.
• Adjectives conjoined by “and” have the same polarity
  • Fair and legitimate, corrupt and brutal
• *fair and brutal, *corrupt and legitimate
• Adjectives conjoined by “but” do not
  • fair but brutal
34
Hatzivassiloglou and McKeown 1997: Step 1
• Label a seed set of 1336 adjectives (all occurring more than 20 times in a 21-million-word WSJ corpus)
• 657 positive: adequate central clever famous intelligent remarkable reputed sensitive slender thriving…
• 679 negative: contagious drunken ignorant lanky listless primitive strident troublesome unresolved unsuspecting…
35
Hatzivassiloglou and McKeown 1997: Step 2
• Expand seed set to conjoined adjectives
36
[Graph of conjoined adjective pairs from the corpus, e.g. “nice, helpful”, “nice, classy”]
Hatzivassiloglou and McKeown 1997: Step 3
• A supervised classifier assigns a “polarity similarity” to each word pair, resulting in a graph:
37
Hatzivassiloglou and McKeown 1997: Step 3
• Clustering for partitioning the graph into two
38
Output Polarity Lexicon
• Positive• bold decisive disturbing generous good honest important large mature
patient peaceful positive proud sound stimulating straightforward strange talented vigorous witty…
• Negative• ambiguous cautious cynical evasive harmful hypocritical inefficient insecure
irrational irresponsible minor outspoken pleasant reckless risky selfish tedious unsupported vulnerable wasteful…
39
Turney Algorithm
Turney, 2002. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews.
1. Extract a phrasal lexicon from reviews
2. Learn polarity of each phrase
3. Rate a review by the average polarity of its phrases
41
Extract two-word phrases with adjectives and adverbs
First word      Second word          Third word (not extracted)
JJ              NN or NNS            anything
RB, RBR, RBS    JJ                   not NN or NNS
JJ              JJ                   not NN or NNS
NN or NNS       JJ                   not NN or NNS
RB, RBR, RBS    VB, VBD, VBN, VBG    anything
42
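The extraction rules in the table can be sketched as follows. The tags are Penn Treebank POS tags, and the example sentence and its tagging are made up; a real system would get the tags from a POS tagger.

```python
# Sketch of Turney's two-word phrase extraction over a POS-tagged
# sentence, following the pattern table above.
NOUN = {"NN", "NNS"}
ADV = {"RB", "RBR", "RBS"}
VERB = {"VB", "VBD", "VBN", "VBG"}

def match(t1, t2, t3):
    """True if tags (t1, t2), with following tag t3, fit one of the patterns."""
    if t1 == "JJ" and t2 in NOUN:
        return True
    if t1 in ADV and t2 == "JJ" and t3 not in NOUN:
        return True
    if t1 == "JJ" and t2 == "JJ" and t3 not in NOUN:
        return True
    if t1 in NOUN and t2 == "JJ" and t3 not in NOUN:
        return True
    if t1 in ADV and t2 in VERB:
        return True
    return False

def extract_phrases(tagged):
    """tagged: list of (word, tag) pairs. Returns the two-word candidate phrases."""
    out = []
    for i in range(len(tagged) - 1):
        t3 = tagged[i + 2][1] if i + 2 < len(tagged) else ""
        if match(tagged[i][1], tagged[i + 1][1], t3):
            out.append((tagged[i][0], tagged[i + 1][0]))
    return out

sent = [("this", "DT"), ("is", "VBZ"), ("a", "DT"),
        ("very", "RB"), ("nice", "JJ"), ("camera", "NN")]
print(extract_phrases(sent))  # → [('nice', 'camera')]
```

Note that “very nice” is not extracted here: the RB+JJ pattern is blocked because the following tag is a noun, exactly as the third column of the table specifies.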
How to measure polarity of a phrase?
• Positive phrases co-occur more with “excellent”
• Negative phrases co-occur more with “poor”
• But how to measure co-occurrence?
43
PMI - Pointwise Mutual Information
• How much more do events x and y co-occur than if they were independent?
• PMI between words: How much more do two words co-occur than if they were independent?
44
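The two questions above can be written out as formulas (reconstructing the standard PMI definition, which appears as an image on the original slide):

```latex
\mathrm{PMI}(x, y) = \log_2 \frac{P(x, y)}{P(x)\,P(y)}
\qquad
\mathrm{PMI}(w_1, w_2) = \log_2 \frac{P(w_1, w_2)}{P(w_1)\,P(w_2)}
```

PMI is 0 when the events are independent, positive when they co-occur more than chance, and negative when they co-occur less.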
How to estimate Pointwise Mutual Information?
• Query a search engine
• Turney used Altavista because it had NEAR operator
• P(word) estimated by hits(word)
• P(word1, word2) estimated by hits(word1 NEAR word2)
45
Does word occur more with “poor” or “excellent”?
46
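Putting the hit-count estimates together gives Turney's semantic orientation score (as given in Turney, 2002; N is the number of indexed pages, and it cancels in the final difference):

```latex
\mathrm{PMI}(w_1, w_2) \approx \log_2 \frac{N \cdot \mathrm{hits}(w_1 \ \mathrm{NEAR}\ w_2)}{\mathrm{hits}(w_1)\,\mathrm{hits}(w_2)}
```

```latex
\mathrm{SO}(\mathrm{phrase})
= \mathrm{PMI}(\mathrm{phrase}, \text{``excellent''}) - \mathrm{PMI}(\mathrm{phrase}, \text{``poor''})
= \log_2 \frac{\mathrm{hits}(\mathrm{phrase}\ \mathrm{NEAR}\ \text{``excellent''}) \cdot \mathrm{hits}(\text{``poor''})}
               {\mathrm{hits}(\mathrm{phrase}\ \mathrm{NEAR}\ \text{``poor''}) \cdot \mathrm{hits}(\text{``excellent''})}
```

A positive SO means the phrase keeps closer company with “excellent”; a negative SO, with “poor”.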
Phrases from a positive review
47
Phrases from a negative review
48
Results of Turney algorithm
• 410 reviews from Epinions
  • 170 (41%) negative
• 240 (59%) positive
• Majority class baseline: 59%
• Turney algorithm: 74%
• Phrases rather than words
• Learns domain-specific information
49
Using WordNet to learn polarity
S.M. Kim and E. Hovy, 2004. Determining the sentiment of opinions.
M. Hu and B. Liu, 2004. Mining and summarizing customer reviews.
• WordNet: online thesaurus
• Create positive (“good”) and negative seed-words (“terrible”)
• Find synonyms and antonyms
  • Positive set: add synonyms of positive words (“well”) and antonyms of negative words
  • Negative set: add synonyms of negative words (“awful”) and antonyms of positive words (“evil”)
• Repeat, following chains of synonyms
• Filter
50
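The propagation loop above can be sketched with a hand-made toy thesaurus standing in for WordNet. The SYN and ANT tables below are invented for illustration, not real WordNet data, and the filtering step is omitted.

```python
# Sketch of seed-set expansion: grow each polarity set by adding
# synonyms of its own words and antonyms of the opposite set's words.
SYN = {"good": {"well", "fine"}, "terrible": {"awful", "dreadful"},
       "well": {"good"}, "awful": {"terrible"}}
ANT = {"good": {"bad", "evil"}, "terrible": {"great"}}

def expand(seeds, opposite_seeds, syn, ant, iterations=2):
    """Repeat the expansion step a fixed number of times."""
    own, other = set(seeds), set(opposite_seeds)
    for _ in range(iterations):
        new = set()
        for w in own:
            new |= syn.get(w, set())       # synonyms of our own words
        for w in other:
            new |= ant.get(w, set())       # antonyms of the opposite words
        own |= new
    return own

positive = expand({"good"}, {"terrible"}, SYN, ANT)
negative = expand({"terrible"}, {"good"}, SYN, ANT)
print(sorted(positive))  # synonyms of "good" plus antonyms of "terrible"
print(sorted(negative))
```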
Summary on semi-supervised lexicon learning
• Intuition:
  • Start with a seed set of words (‘good’, ‘poor’)
  • Find other words that have similar polarity:
    • Using “and” and “but”
    • Using words that occur nearby in the same document
    • Using WordNet synonyms and antonyms
• Advantages:
  • Can be domain-specific
  • Can be more robust (more words)
51
Using lexicon to detect document sentiment
Simplest unsupervised method
• Count the words with positive sentiment in a document
• Count the words with negative sentiment
• Choose whichever value (positive or negative) has the higher sum
52
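A minimal sketch of this count-and-compare method; the tiny lexicon below is invented, standing in for a real resource such as the Bing Liu lists.

```python
# Unsupervised lexicon method: count positive vs negative words
# and pick whichever side has more hits.
POS_WORDS = {"great", "excellent", "wonderful", "best"}
NEG_WORDS = {"worst", "pathetic", "boring", "awful"}

def lexicon_sentiment(tokens):
    pos = sum(t in POS_WORDS for t in tokens)
    neg = sum(t in NEG_WORDS for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # tie: the simple method cannot decide

print(lexicon_sentiment("it was pathetic the worst boxing scenes".split()))
# → negative
```

The weakness is obvious from the hardware-store letter earlier in the lecture: positive words about a competitor still count as positive hits.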
Using lexicon to detect document sentiment
Simplest supervised method
• Build a classifier
• Predict sentiment (or emotion, or personality) given features
• Use counts of lexicon categories as features
• Sample features:
  • LIWC category “cognition” had a count of 7
  • NRC Emotion category “anticipation” had a count of 2
• Baseline
  • Instead use counts of all the words and bigrams in the training set
  • This is hard to beat
  • But only works if the training and test sets are very similar
53
Conclusion on Sentiment Analysis
• Sentiment analysis is about detecting the polarity of subjective opinions
• Related tasks, all related to subjective aspects:• Emotion detection, emotion intensity detection
• Irony, sarcasm, humor detection
• Personality analysis
• In the simplest case SA is a document classification task, trying to predict the binary sentiment: positive or negative
• Task or domain-specific sentiment lexicons are useful in SA
54
LDA topic modeling
55
LDA topic modeling
• LDA – Latent Dirichlet Allocation
• Unsupervised document clustering into topics or themes
• Each document will be represented as a mixture of K topics
• The topic vectors can be used as document representations in further classification task
56
57
David Blei, 2012. Probabilistic Topic Models
Intuition of LDA
• Each topic is a distribution over words in a fixed vocabulary
  • The genetics topic has words about genetics with high probability
  • The evolutionary biology topic has words about evolutionary biology with high probability
• Each document has a distribution over topics
  • In a two-topic situation (genetics and evolutionary biology) a document may be represented for instance with a vector [0.7, 0.3]
  • This is a mixture of topics
58
Generative story
LDA is a generative model, thus it has a generative story
• For each topic, generate a distribution over words
• For each document, generate a distribution over topics
• For each word position in the document:
  • Sample a topic from the document’s topic distribution
• Sample a word from that topic distribution
59
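The generative story can be written directly as code: sampling a toy document from hand-set (not learned) topic-word and document-topic distributions. The two topics, their vocabularies, and all probabilities below are made up for illustration.

```python
# The LDA generative story for one document: for each word position,
# sample a topic, then sample a word from that topic's distribution.
import random
random.seed(0)

# Two topics over a tiny fixed vocabulary (invented probabilities)
TOPICS = {
    "genetics": {"gene": 0.5, "dna": 0.4, "life": 0.1},
    "evolution": {"species": 0.5, "selection": 0.3, "life": 0.2},
}

def sample_doc(doc_topic_dist, n_words):
    """doc_topic_dist: {topic: probability}, e.g. the [0.7, 0.3] mixture."""
    words = []
    for _ in range(n_words):
        topic = random.choices(list(doc_topic_dist),
                               weights=list(doc_topic_dist.values()))[0]
        word_dist = TOPICS[topic]
        word = random.choices(list(word_dist),
                              weights=list(word_dist.values()))[0]
        words.append(word)
    return words

doc = sample_doc({"genetics": 0.7, "evolution": 0.3}, n_words=8)
print(doc)  # eight words, drawn mostly from the genetics topic
```

Training inverts this story: given only the words, infer the topic-word and document-topic distributions that most plausibly generated them.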
LDA model – plate diagram
60
[Plate diagram labels: hyperparameters; document-topic distributions; topic index for each word; words; topic-word distributions]
David Blei, 2012. Probabilistic Topic Models
Training an LDA model
• Learn the topic-word distributions
• For each document find the document-topic distributions
• Training equals inference because the model is unsupervised
  • By training the model on a text corpus we also obtain the topic vectors for these documents
Two main methods (both are beyond the scope of our course)
• Gibbs sampling
• Variational Bayes
61
LDA fit to 17000 articles from Science
62
David Blei, 2012. Probabilistic Topic Models
Tools for training topic models
• Gensim
• Mallet
• Stanford Topic Modeling Toolkit
• jLDADMM
• R topic modeling package
63
Evaluation
64
Evaluation protocols
• Train-dev-test split: 80/10/10, 70/20/10, 70/15/15
  • Train on the training set
• Tune hyper-parameters on development set
• Test the best model on test set
• What if we can’t afford to split the data?
  • Cross-validation
65
K-fold cross-validation
• Break the data up into K folds (here K = 5)
• For each fold:
  • Use that fold as a temporary test set
  • Train on the remaining 4 folds, evaluate on the test fold
• Report the average performance over the 5 runs
66
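The fold loop above can be sketched in plain Python (no library assumed); each item ends up in the test set exactly once:

```python
# K-fold split: partition n_items indices into k contiguous test folds;
# everything outside the current fold is training data.
def kfold_indices(n_items, k):
    """Yield (train_idx, test_idx) pairs for k folds over n_items items."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size

for train_idx, test_idx in kfold_indices(10, 5):
    print(test_idx)  # each run: a distinct 2-item test fold, 8 items for training
```

In practice the data should be shuffled (or stratified) before splitting so the folds are not biased by the original ordering.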
[Diagram: five iterations of 5-fold CV; in each iteration a different fold serves as the test set and the remaining four folds are used for training.]
Cross-validation
• Most commonly used: 10-fold CV and 5-fold CV
  • The extreme case is LOOCV – leave-one-out cross-validation
• Stratified K-fold CV – the label distribution is preserved in each fold
  • If the dataset contains 10 positive items and 30 negative items and the data is divided into 5 folds, how many positive and how many negative items will be in each fold?
• The performance on different folds can vary a lot, especially when the folds are small
• If possible, hold out a separate test set and tune the hyperparameters using CV on the training set
67
Aggregating CV results
• Micro-averaging
  • Sum the TP, FP and FN counts from all folds and then compute precision, recall and F-score
• Macro-averaging
  • Compute precision, recall and F-score for each fold and then average
68
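The difference between the two schemes is easy to see in code; the per-fold (TP, FP, FN) counts below are made up for illustration:

```python
# Micro- vs macro-averaging of precision, recall and F-score
# over per-fold confusion counts.
def prf(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

fold_counts = [(8, 2, 1), (5, 5, 3), (9, 1, 2)]  # (TP, FP, FN) per fold

# Micro: pool the raw counts first, then compute P/R/F once
tp = sum(c[0] for c in fold_counts)
fp = sum(c[1] for c in fold_counts)
fn = sum(c[2] for c in fold_counts)
micro = prf(tp, fp, fn)

# Macro: compute P/R/F per fold, then average the scores
per_fold = [prf(*c) for c in fold_counts]
macro = tuple(sum(s[i] for s in per_fold) / len(per_fold) for i in range(3))

print("micro P/R/F:", micro)
print("macro P/R/F:", macro)
```

Micro-averaging weights every item equally (large folds or classes dominate), while macro-averaging weights every fold, or class, equally; the same two schemes apply to per-class scores in the multiclass setting on the next slides.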
Precision and recall in multiclass classification
• Contingency table
69
Figure 6.5: https://web.stanford.edu/~jurafsky/slp3/6.pdf
Aggregating precision and recall
70
Figure 6.6: https://web.stanford.edu/~jurafsky/slp3/6.pdf
Conclusion for Extras
• LDA topic modeling – unsupervised method for clustering documents based on themes or topics
• It can provide useful features for document classification
• Use cross-validation when you can’t afford to split the data into train/dev/test
• Split data into train/test and perform CV with train set to choose features and hyperparameters
• With multiple classes evaluate precision and recall for every class and then aggregate with micro- or macro-averaging
71