Unsupervised Sentiment Analysis

Taras Zagibalov © 2009. Taras Zagibalov [email protected], PhD candidate at University of Sussex, Brighton, UK. Ford Foundation International Fellowship fellow. Natural languages: Russian, English, Mandarin. Programming: Java, Prolog.

description

What is sentiment analysis? How can it be used in business? What is the potential of unsupervised sentiment analysis?

Transcript of Unsupervised Sentiment Analysis

Page 1: Unsupervised Sentiment Analysis

Taras Zagibalov© 2009

Taras Zagibalov [email protected]

PhD candidate at University of Sussex, Brighton, UK

Ford Foundation International Fellowship fellow

Natural languages: Russian, English, Mandarin

Programming: Java, Prolog

Page 2: Unsupervised Sentiment Analysis

Unsupervised Sentiment Analysis

Listening to the Word of Mouth

What is it?
How does it work?
How can it be used?

Page 3: Unsupervised Sentiment Analysis

Outline

What is Sentiment Analysis?
Application of Sentiment Analysis
Who's in the business?
Unsolved Problems
Why unsupervised?
Is it effective?

Page 4: Unsupervised Sentiment Analysis

Sentiment Analysis

Sentiment Analysis (or Opinion Mining) is a relatively new research area in Information Retrieval and Natural Language Processing, which is concerned not with a document's topic, but with what opinion it expresses.

Page 5: Unsupervised Sentiment Analysis

What is Sentiment Analysis

Subjectivity Classification
Orientation Detection
Opinion Holder and Target Extraction
Feature-Based Opinion Mining

Page 6: Unsupervised Sentiment Analysis

What is Sentiment Analysis

Subjectivity Classification: is the statement objective or subjective?

"A car has four wheels." (objective)

vs

"It's a good car." (subjective)

Page 7: Unsupervised Sentiment Analysis

What is Sentiment Analysis

Orientation Detection: is the opinion positive or negative?

"It's a good car." (positive)

vs

"It's a bad car." (negative)

Page 8: Unsupervised Sentiment Analysis

What is Sentiment Analysis

Opinion Holder and Target Extraction: who holds the opinion, and what is it about?

"Ian says it's a good car." (holder: Ian; target: the car)

Page 9: Unsupervised Sentiment Analysis

What is Sentiment Analysis

Feature-Based Opinion Mining: which features of the target are praised or criticised?

"The wheels are good, but all the rest is just unusable."
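
As a rough illustration of feature-based opinion mining on the sentence above, the following is a minimal sketch, not a method taken from the talk. It assumes tiny hand-picked word lists and a naive rule that the opinion target is whatever precedes "is" or "are" in a clause; the names `POSITIVE`, `NEGATIVE` and `clause_opinions` are hypothetical.

```python
import re

# Hypothetical toy word lists, chosen only to cover the slide's example.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "unusable"}


def clause_opinions(sentence):
    """Return (target, polarity) pairs, one per opinionated clause.

    Assumption: clauses are separated by commas or "but", and the target
    is the run of words before the first "is"/"are" in the clause.
    A real system would rely on parsing rather than these rules.
    """
    results = []
    for clause in re.split(r",|\bbut\b", sentence.lower()):
        words = re.findall(r"[a-z']+", clause)
        if any(w in POSITIVE for w in words):
            polarity = "positive"
        elif any(w in NEGATIVE for w in words):
            polarity = "negative"
        else:
            continue  # clause carries no known opinion word
        target = []
        for w in words:
            if w in {"is", "are"}:
                break
            target.append(w)
        results.append((" ".join(target), polarity))
    return results


print(clause_opinions("The wheels are good, but all the rest is just unusable."))
```

On the slide's sentence this yields one positive opinion about "the wheels" and one negative opinion about "all the rest", which is the kind of per-feature breakdown this task is after.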

Page 10: Unsupervised Sentiment Analysis

Application of Sentiment Analysis

Where can opinions be found?

News feeds (Google, Yahoo, Reuters etc.)
Blogs (LJ, Technorati etc.)
Social Networks (Twitter, Facebook...)
Customer review sites (Amazon, eBay...)

Page 11: Unsupervised Sentiment Analysis

Application of Sentiment Analysis

Marketing Research
Product Reviews Analysis
Brand Tracking
Influence Analysis
Public Opinion Tracking
Customer Correspondence Analysis

Page 12: Unsupervised Sentiment Analysis

Application of Sentiment Analysis

What questions can a sentiment analysis system answer?

What do customers think about our product?
Which of our customers are dissatisfied?
What features of our product are the worst?
Who influences our image, and how?
What is the public reaction to (some event or some person)?
and so on...

Page 13: Unsupervised Sentiment Analysis

Example 1

Online (blogs, mass media) monitoring of product promotion campaigns

Promotional campaign A is successful, as most of the online reviews are positive.

Promotional campaign B needs immediate action, as most of the online reviews are negative.

[Chart comparing A and B; axis ticks from 0 to 10]
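
The comparison of campaigns A and B comes down to aggregating per-review sentiment labels. A minimal sketch of that aggregation follows; the review labels are made up for illustration, and the 0.5 threshold and the function names are assumptions rather than anything stated in the talk.

```python
from collections import Counter


def positive_share(labels):
    """Return the fraction of reviews labelled 'positive' (0.0 if there are none)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return counts["positive"] / total if total else 0.0


def campaign_status(labels, threshold=0.5):
    """Call a campaign successful when positive reviews exceed the threshold share."""
    return "successful" if positive_share(labels) > threshold else "needs action"


# Made-up labels standing in for the output of a sentiment classifier.
reviews = {
    "A": ["positive", "positive", "negative", "positive"],
    "B": ["negative", "negative", "positive", "negative"],
}

for name, labels in reviews.items():
    print(name, positive_share(labels), campaign_status(labels))
```

With these invented labels, campaign A has a positive share of 0.75 and campaign B of 0.25, so only B is flagged as needing action.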

Page 14: Unsupervised Sentiment Analysis

Example 2

New product release as mirrored in customer online reviews

(A) The product release and ad campaign are quite effective, as public opinion is mostly positive. But the sentiment changes as sales grow (B): more people are dissatisfied, and this needs to be analysed (probably some quality-related issues).

[Chart with points A and B marked; axis ticks from 0 to 8]

Page 15: Unsupervised Sentiment Analysis

Example 3

Influence analysis by tracking blogs

(A) A negative review in a newspaper does not affect the generally positive sentiment towards a product, whereas a positive review in a magazine (B) is quite effective.

[Chart with points A and B marked; axis ticks from 0 to 9]

Page 16: Unsupervised Sentiment Analysis

Who's in the business?

BrandWatch
Istrategy Labs
Cataphora
Scoutlabs
Lexalytics
Infonic
Attensity
Open Dover
...

Page 17: Unsupervised Sentiment Analysis

What's the technology?

Machine Learning
  Manually tagged training data sets
  User-tagged training data sets (“thumbs up” and “five stars”)

Knowledge-based Approaches
  Manually created word-lists
  Generic word-lists (like SentiWordNet or sentiment vocabularies)

Manual Processing
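
To make the knowledge-based approach concrete, here is a minimal sketch of a word-list classifier. The two word lists are hypothetical and far smaller than any real manually created list, and the simple count-and-compare rule is only one of many possible scoring schemes.

```python
import re

# Hypothetical, manually created word lists (a real list would be far larger).
POSITIVE_WORDS = {"good", "great", "excellent", "love"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "unusable"}


def classify(text):
    """Classify text by counting matches against the two word lists."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # A zero score covers both opinion-free text and evenly mixed opinions.
    return "neutral"


for sentence in ["It's a good car.", "It's a bad car.", "A car has four wheels."]:
    print(sentence, "->", classify(sentence))
```

The quality of such a classifier depends entirely on how well the word lists match the texts being processed, which is where the problems discussed next come in.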

Page 18: Unsupervised Sentiment Analysis

Unsolved Problems

Domain-dependency
Unpredictable evaluation language
Language-dependency

Page 19: Unsupervised Sentiment Analysis

Unsolved Problems

Domain-dependency: the same word can carry opposite sentiment in different domains.

"The plot was unpredictable" (positive for a film or a book)

vs

"The steering was unpredictable" (negative for a car)

Page 20: Unsupervised Sentiment Analysis

Unsolved Problems

Unpredictable evaluation language

“good” == “bad” on eBay

“3G” (technology for mobile phones) == “good”
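
One way to picture both problems (domain-dependency and unpredictable evaluation language) is as a polarity lookup that must be keyed by domain as well as by word. The sketch below assumes a generic table plus hypothetical per-domain overrides; all table contents and names are illustrative, not data from the talk.

```python
# Generic list: one polarity per word, regardless of domain (illustrative values).
GENERIC_POLARITY = {"unpredictable": "negative", "good": "positive"}

# Hypothetical per-domain overrides, e.g. derived from each domain's own documents.
DOMAIN_POLARITY = {
    "films": {"unpredictable": "positive"},
    "cars": {"unpredictable": "negative"},
    "mobile phones": {"3g": "positive"},
}


def word_polarity(word, domain):
    """Prefer the domain-specific polarity; fall back to the generic list."""
    word = word.lower()
    domain_table = DOMAIN_POLARITY.get(domain, {})
    if word in domain_table:
        return domain_table[word]
    return GENERIC_POLARITY.get(word, "unknown")


print(word_polarity("unpredictable", "films"))   # domain override applies
print(word_polarity("unpredictable", "cars"))
print(word_polarity("3G", "mobile phones"))      # absent from the generic list
print(word_polarity("3G", "cars"))
```

A generic list alone would mark "unpredictable" as negative everywhere and would know nothing about "3G", which is the noise that domain-specific information is meant to avoid.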

Page 21: Unsupervised Sentiment Analysis

Unsolved Problems

Language-dependency

Culture-related issues (“good” <> “好”)

Language-related issues (SVO vs SOV)

Page 22: Unsupervised Sentiment Analysis

Why unsupervised?

Cross-Domain applicability
Multi-Lingual applicability
Cheap Start

Page 23: Unsupervised Sentiment Analysis

Why unsupervised?

Cross-Domain applicability

No expensive human annotation is needed: all information is found in the documents that need to be processed. All extracted information is domain-specific and free from the noise produced by “generic” word lists and wordnets.
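
To illustrate how sentiment vocabulary might be extracted from the documents themselves, here is a minimal sketch of a seed-based co-occurrence heuristic. It is a simplified stand-in and not the method evaluated in the talk: the seed words, the `min_margin` threshold and the toy documents are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def induce_lexicon(documents, pos_seeds, neg_seeds, min_margin=2):
    """Assign polarity to words that co-occur mostly with one kind of seed.

    A document counts as positive (negative) context if it contains a
    positive (negative) seed and no seed of the opposite kind.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for doc in documents:
        tokens = set(tokenize(doc))
        has_pos = bool(tokens & pos_seeds)
        has_neg = bool(tokens & neg_seeds)
        if has_pos == has_neg:
            continue  # no seed at all, or conflicting seeds: skip
        target = pos_counts if has_pos else neg_counts
        target.update(tokens - pos_seeds - neg_seeds)

    lexicon = {}
    for word in set(pos_counts) | set(neg_counts):
        margin = pos_counts[word] - neg_counts[word]
        if margin >= min_margin:
            lexicon[word] = "positive"
        elif margin <= -min_margin:
            lexicon[word] = "negative"
    return lexicon


# Unlabelled toy documents; only the seeds "good" and "bad" are given.
docs = [
    "good phone, reliable battery",
    "good price and reliable service",
    "bad screen, flimsy case",
    "bad support and flimsy packaging",
    "the phone arrived on monday",
]
print(induce_lexicon(docs, {"good"}, {"bad"}))
```

On this toy data the heuristic picks up "reliable" as positive and "flimsy" as negative without any annotated documents, while words seen with both seeds (such as "and") or seen only once are left out.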

Page 24: Unsupervised Sentiment Analysis

Why unsupervised?

Multi-Lingual applicability

Unsupervised systems do not depend on language-specific training data, so they can be easily ported to almost any language.

Page 25: Unsupervised Sentiment Analysis

Why unsupervised?

Cheap Start

Once an unsupervised system is developed, it can be applied to new data almost immediately, saving the cost of labelling data and/or writing rules (word-lists).

Page 26: Unsupervised Sentiment Analysis

Is it effective?

The unsupervised approach was tested on corpora in different languages (English, Simplified Chinese, Traditional Chinese, Japanese) and in many cases compared reasonably well with supervised methods.

Results were presented at several major international scientific conferences (ACL, IJCNLP, COLING, NTCIR).

Page 27: Unsupervised Sentiment Analysis

Is it effective?

The approach can be easily combined with supervised techniques:

An unsupervised system can provide initial data for in-depth analysis (building up word-lists and rule-sets).

Automatically extracted information can be used for training machine learning systems.
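
The second point can be sketched as pseudo-labelling: documents are labelled with an automatically extracted lexicon, and those labels then train a supervised classifier. The example below is only an illustration under stated assumptions. The two-word lexicon and the toy documents are invented, and the small Naive Bayes classifier with add-one smoothing is a generic choice, not the system described in the talk.

```python
import math
import re
from collections import Counter, defaultdict

WEIGHTS = {"positive": 1, "negative": -1}


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def pseudo_label(documents, lexicon):
    """Label documents using the extracted lexicon; skip those with a zero score."""
    labelled = []
    for doc in documents:
        score = sum(WEIGHTS.get(lexicon.get(t), 0) for t in tokenize(doc))
        if score != 0:
            labelled.append((doc, "positive" if score > 0 else "negative"))
    return labelled


def train_naive_bayes(labelled):
    """Collect per-class word counts, class counts and the vocabulary."""
    word_counts, class_counts, vocab = defaultdict(Counter), Counter(), set()
    for doc, label in labelled:
        class_counts[label] += 1
        for t in tokenize(doc):
            word_counts[label][t] += 1
            vocab.add(t)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokenize(text):
            if t in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical automatically extracted lexicon and unlabelled documents.
lexicon = {"reliable": "positive", "flimsy": "negative"}
unlabelled = [
    "reliable battery and sturdy case",
    "reliable and sturdy phone",
    "flimsy case with a cracked screen",
    "flimsy and cracked cover",
    "the phone arrived on monday",
]
model = train_naive_bayes(pseudo_label(unlabelled, lexicon))
print(predict(model, "sturdy build"), predict(model, "cracked screen"))
```

The trained classifier labels "sturdy build" as positive and "cracked screen" as negative, even though neither "sturdy" nor "cracked" was in the original lexicon: the supervised model picks up words that co-occur with the automatically labelled data.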

Page 28: Unsupervised Sentiment Analysis

Conclusion

Unsupervised Sentiment Analysis is an efficient instrument for keeping track of public opinion in different domains and languages.

It can be used as an entry point to a new domain or language.

It can be combined with supervised methods to increase accuracy.