SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT...
Transcript of SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT...
![Page 1: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/1.jpg)
SSentiment AAnalysis
Presented by
Aditya Joshi 08305908
Guided by
Prof. Pushpak Bhattacharyya
IIT Bombay
![Page 2: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/2.jpg)
What is SA & OM?
• Identify the orientation of opinion in a piece of text
• Can be generalized to a wider set of emotions
The movie was fabulous!
The movie stars Mr. X
The movie was horrible!
![Page 3: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/3.jpg)
Motivation
• Knowing sentiment is a very natural ability of a human being.
Can a machine be trained to do it?
• SA aims at getting sentiment-related knowledge especially from the huge amount of information on the internet
• Can be generally used to understand opinion in a set of documents
![Page 4: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/4.jpg)
Tripod of Sentiment AnalysisCognitive Science
Natural Language Processing
Machine Learning
Sentiment Analysis
Natural Language Processing
Machine Learning
![Page 5: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/5.jpg)
Contents :
Challenges
Subjectivitydetection
SAApproaches
Applications
Lexical Resources
![Page 6: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/6.jpg)
Challenges
• Contrasts with standard text-based categorization
• Domain dependent
• Sarcasm
• Thwarted expressions
Mere presence of words is Indicative of the category
in case of text categorization.
Not the case with sentiment analysis
• Contrasts with standard text-based categorization
• Domain dependent
• Sarcasm
• Thwarted expressions
Sentiment of a word is w.r.t. the
domain.
Example: ‘unpredictable’
For steering of a car,
For movie review,
• Contrasts with standard text-based categorization
• Domain dependent
• Sarcasm
• Thwarted expressions
Sarcasm uses words ofa polarity to represent
another polarity.
Example: The perfume is soamazing that I suggest you wear it
with your windows shut
• Contrasts with standard text-based categorization
• Domain dependent
• Sarcasm
• Thwarted expressions
the sentences/words that contradict the overall sentiment
of the set are in majority
Example: The actors are good, the music is brilliant and appealing.
Yet, the movie fails to strike a chord.
![Page 7: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/7.jpg)
SentiWordNet
•Lexical resource for sentiment analysis
•Built on the top of WordNet synsets
•Attaches sentiment-related information with synsets
![Page 8: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/8.jpg)
Quantifying sentiment
Objective Polarity
Subjective Polarity
Positive Negative
Term sense position
Each term has a Positive, Negative and Objective score. The scores sum to one.
![Page 9: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/9.jpg)
Building SentiWordNet
• Ln, Lo, Lp are the three seed sets• Iteratively expand the seed sets through K
steps• Train the classifier for the expanded sets
![Page 10: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/10.jpg)
LpLn
also-se
e
antonymy
Expansion of seed sets
The sets at the end of kth step are called Tr(k,p) and Tr(k,n)
Tr(k,o) is the set that is not present in Tr(k,p) and Tr(k,n)
![Page 11: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/11.jpg)
Committee of classifiers
• Train a committee of classifiers of different types and different K-values for the given data
• Observations:– Low values of K give high precision and low
recall– Accuracy in determining positivity or
negativity, however, remains almost constant
![Page 12: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/12.jpg)
WordNet Affect
• Similar to SentiWordNet (an earlier work)
• WordNet-Affect: WordNet + annotated affective concepts in hierarchical order
• Hierarchy called ‘affective domain labels’– behaviour– personality– cognitive state
![Page 13: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/13.jpg)
Subjectivity detection
• Aim: To extract subjective portions of text
• Algorithm used: Minimum cut algorithm
![Page 14: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/14.jpg)
Constructing the graph
• Why graphs?
• Nodes and edges?
• Individual Scores
• Association scoresTo model item-specific
and pairwise information independently.
• Why graphs?
• Nodes and edges?
• Individual Scores
• Association scores
Nodes: Sentences of the document and source & sink
Source & sink representthe two classes of sentences
Edges: Weighted with either of the two scores
• Why graphs?
• Nodes and edges?
• Individual Scores
• Association scoresPrediction whether
the sentence is subjective or not
Indsub(si)=
• Why graphs?
• Nodes and edges?
• Individual Scores
• Association scoresPrediction whether two
sentences should have
the same subjectivity level
T : Threshold – maximum distance upto which sentences may be considered proximalf: The decaying functioni, j : Position numbers
![Page 15: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/15.jpg)
Constructing the graph
• Build an undirected graph G with vertices {v1, v2…,s, t} (sentences and s,t)
• Add edges (s, vi) each with weight ind1(xi)
• Add edges (t, vi) each with weight ind2(xi)
• Add edges (vi, vk) with weight assoc (vi, vk)
• Partition cost:
![Page 16: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/16.jpg)
Example
Sample cuts:
![Page 17: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/17.jpg)
Document
Subjective
Results (1/2)
• Naïve Bayes, no extraction : 82.8%
• Naïve Bayes, subjective extraction : 86.4%
• Naïve Bayes, ‘flipped experiment’ : 71 %
DocumentSubjectivity
detectorObjective
POLARITY CLASSIFIER
![Page 18: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/18.jpg)
Results (2/2)
![Page 19: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/19.jpg)
Approach 1: Using adjectives
• Many adjectives have high sentiment value– A ‘beautiful’ bag– A ‘wooden’ bench– An ‘embarrassing’ performance
• An idea would be to augment this polarity information to adjectives in the WordNet
![Page 20: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/20.jpg)
Setup
• Two anchor words (extremes of the polarity spectrum) were chosen
• PMI of adjectives with respect to these adjectives is calculated
Polarity Score (W)= PMI(W,excellent) – PMI (W, poor)
excellent poor
wordPMI PMI
![Page 21: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/21.jpg)
Experimentation
• K-means clustering algorithm used on the basis of polarity scores
• The clusters contain words with similar polarities
• These words can be linked using an ‘isopolarity link’ in WordNet
![Page 22: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/22.jpg)
Results
• Three clusters seen
• Major words were with negative polarity scores
• The obscure words were removed by selecting adjectives with familiarity count of 3– the ones that are not very common
![Page 23: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/23.jpg)
Approach 2: Using Adverb-Adjective Combinations (AACs)
• Calculate sentiment value based on the effect of adverbs on adjectives
• Linguistic ideas:• Adverbs of affirmation: certainly• Adverbs of doubt: possibly• Strong intensifying adverbs: extremely• Weak intensifying adverbs: scarcely• Negation and Minimizers: never
![Page 24: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/24.jpg)
Moving towards computation…
• Based on type of adverb, the score of the resultant AAC will be affected
• Example of an axiom:
• Example : ‘extremely good’ is more positive than ‘good’
![Page 25: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/25.jpg)
AAC Scoring Algorithms
1. Variable Scoring Algorithm
2. Adjective Priority Scoring Algorithm
3. Adverb first scoring algorithm
![Page 26: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/26.jpg)
Scoring the sentiment on a topic
• Rel (t) : Sentences in d that reference to topic t
• s : Sentence is Rel (t)
• Appl+(s) : AACs with positive score in s
• Appl-(s) : AACs with negative score in s
• Return strength =
![Page 27: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/27.jpg)
Findings
• APSr with r=0.35 worked the best (Better correlation with human subject)– Adjectives are more important than adverbs in
terms of sentiment
• AACs give better precision and recall as compared to only adjectives
![Page 28: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/28.jpg)
Approach 3: Subject-based SA
• Examples:
The horse bolted.
The movie lacks a good story.
![Page 29: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/29.jpg)
Lexicon
subj. bolt
b VB bolt subj
subj. lack obj.
b VB lack obj ~subj
Argument that sends the sentiment (subj./obj.)
Argument that receives the sentiment (subj./obj.)
Argument that receives the sentiment (subj./obj.)
![Page 30: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/30.jpg)
Lexicon
• Also allows ‘\S+’ characters
• Similar to regular expressions
• E.g. to put \S+ to risk– The favorability of the subject depends on the
favorability of ‘\S+’.
![Page 31: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/31.jpg)
Example
The movie lacks a good story.
G JJ good obj.
The movie lacks \S+.
B VB lack obj ~subj.
Lexicon : Steps :
1) Consider a context window of upto five words
2) Shallow parse the sentence
3) Step-by-step calculate the sentiment value based on lexicon and by adding ‘\S+’ characters at each step
![Page 32: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/32.jpg)
Results
Description Precision Recall
Benchmark corpus
Mixed statements
94.3% 28%
Open Test corpus
Reviews of a camera
94% 24%
![Page 33: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/33.jpg)
Applications
• Review-related analysis
• Developing ‘hate mail filters’ analogous to ‘spam mail filters’
• Question-answering (Opinion-oriented questions may involve different treatment)
![Page 34: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/34.jpg)
Conclusion & Future Work
• Lexical Resources have been developed to capture sentiment-related nature
• Subjective extracts provide a better accuracy of sentiment prediction
• Several approaches use algorithms like Naïve Bayes, clustering, etc. to perform sentiment analysis
• The cognitive angle to Sentiment Analysis can be explored in the future
![Page 35: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/35.jpg)
References (1/2)
• Tetsuya Nasukawa, Jeonghee Yi. ‘Sentiment Analysis: Capturing Favorability Using Natural Language Processing’. In K-CAP ’03, Florida, pages 1-8. 2003.
• Alekh Agarwal, Pushpak Bhattacharyya. ‘Augmenting WordNet with polarity information on adjectives’. In K-CAP ’03, Florida, pages 1-8. 2003.
• SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining Andrea Esuli, Fabrizio Sebastiani
• ‘Machine Learning’, Han and Kamber, 2nd edition, 310-330.• http://wordnet.princeton.edu• Farah Benamara, Carmine Cesarano, Antonio Picariello, VS Subrahmanian
et al; ‘Sentiment Analysis: Adjectives and Adverbs are better than Adjectives Alone’; In ICWSM ’2007 Boulder, CO USA, 2007.
![Page 36: SA Sentiment Analysis Presented by Aditya Joshi 08305908 Guided by Prof. Pushpak Bhattacharyya IIT Bombay.](https://reader030.fdocuments.net/reader030/viewer/2022032805/56649ee75503460f94bf8c05/html5/thumbnails/36.jpg)
References (2/2)
• Jon M. Kleinberg; ‘Authoritative Sources in a Hyperlinked Environment’ as IBM Research Report RJ 10076, May 1997, Pgs. 1 – 34.
• www.cs.uah.edu/~jrushing/cs696-summer2004/notes/Ch8Supp.ppt• Opinion Mining and Sentiment Analysis, Foundations and Trends in
Information Retrieval, B. Pang and L. Lee, Vol. 2, Nos. 1–2 (2008) 1–135, 2008.
• Bo Pang, Lillian Lee; ‘A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts’; Proceedings of the 42nd ACL; pp. 271–278; 2004.
• http://www.cse.iitb.ac.in/~veeranna/ppt/Wordnet-Affect.ppt