Sentiment Classification of on-line Products based on Machine Learning Techniques and Multi-agent System Technologies

Almashraee Mohammed, Dagmar Monett-Diaz

November 19, 2012

School of Economics and Law HWR-Berlin Humboldt University Berlin

Outline

Introduction

Research Questions

Related Works

Proposed Methodology

Summary and Future Works

Social Media Sources... Why!?

Figure : Social Media Sources and Application Domains

Opinion Mining/Sentiment Analysis... Why!?

Figure : Opinion Mining Research

Opinion Mining - A Classification Task

Figure : Sentiment Classification

Levels of Sentiment Analysis

Document Level
    Single entity in each document

Sentence Level
    Single entity in each sentence
    Subjectivity classification

(Entity - Aspects)/Feature Level
    Opinion-based classification
        1 Sentiment (positive or negative)
        2 Opinion target
    Structured Summary

Research Questions

1 To what extent are multi-agent systems (MAS) and machine learning (ML) able to solve the problem of product sentiment classification?

2 In what way and in what form should agents communicate with each other?

3 How can the results from different sources be merged?

[Pang et al, 2002]

Features             #features  Freq. or pres.  NB    ME    SVM
unigrams             16165      freq.           78.7  N/A   72.8
unigrams             16165      pres.           81.0  80.4  82.9
unigrams + bigrams   32330      pres.           80.6  80.8  82.7
bigrams              16165      pres.           77.3  77.4  77.1
unigrams + POS       16695      pres.           81.5  80.4  81.9
adjectives           2633       pres.           77.0  77.7  75.1
top 2633 unigrams    2633       pres.           80.3  81.0  81.4
unigrams + position  22430      pres.           81.0  80.1  81.6

Table : Machine Learning Features for Classification (accuracy in %; NB: Naive Bayes, ME: Maximum Entropy, SVM: Support Vector Machines; freq.: feature frequency, pres.: feature presence)
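The "freq." and "pres." rows of the table differ only in how a unigram enters the feature vector: as its count in the document, or as a 0/1 indicator. The following is a minimal sketch of the two encodings. The whitespace tokenization, the toy vocabulary and the function name `unigram_features` are illustrative assumptions, not details taken from [Pang et al, 2002].

```python
from collections import Counter


def unigram_features(text, vocabulary, presence=True):
    """Encode one document over a fixed vocabulary.

    presence=True  -> 1 if the unigram occurs at all, else 0
    presence=False -> number of occurrences of the unigram
    Lowercasing plus whitespace splitting is an assumed tokenization.
    """
    counts = Counter(text.lower().split())
    if presence:
        return [1 if counts[word] > 0 else 0 for word in vocabulary]
    return [counts[word] for word in vocabulary]


# Hypothetical vocabulary and review text, for illustration only.
vocabulary = ["good", "bad", "zoom", "battery"]
review = "good zoom good battery"

freq_vector = unigram_features(review, vocabulary, presence=False)
pres_vector = unigram_features(review, vocabulary, presence=True)
```

Either vector could then be passed to any of the three classifiers in the table; only the repeated word "good" is encoded differently.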

[Liu et al. 2005]
    Opinion Observer prototype
    Supervised rule-discovery techniques

[Hu and Liu, 2004a,b]
    Frequent product features found by association rule mining

[Schweitzer and Garcia 2010]
    Fast, highly efficient search based on MAS
    Relatedness between gathered pages and the user's request
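The idea of picking out frequent product features can be illustrated with a simple support count. This is only a rough sketch: it counts single candidate words, whereas association rule mining works on itemsets. The candidate words per sentence, the threshold value and the function name `frequent_features` are all assumptions made for illustration.

```python
from collections import Counter


def frequent_features(sentences, min_support=0.5):
    """Return candidate features whose support reaches min_support.

    sentences: list of sets, each holding the candidate feature words
    (for example nouns) found in one review sentence.
    Support is the fraction of sentences that contain the word.
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    total = len(sentences)
    return sorted(w for w, c in counts.items() if c / total >= min_support)


# Hypothetical candidate words extracted from four review sentences.
sentences = [
    {"battery", "zoom"},
    {"battery", "price"},
    {"zoom", "battery"},
    {"strap"},
]
features = frequent_features(sentences, min_support=0.5)
```

With these toy sentences, "battery" and "zoom" reach the threshold, while "price" and "strap" are dropped as infrequent.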

Figure : Multi-agent system based opinion mining

System Input: A user-specified product plus relevant corpora collected from different social sources by several agents.

System Output: A summary of the product features with their associated polarities.

The idea:

1 Information retrieval and representation
2 Relatedness between gathered pages and the user's request
    Feature extraction
    Polarity classification
3 Learning extra features
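The feature extraction and polarity classification steps can be pictured as mapping each document to one polarity value per product feature. The sketch below uses a small opinion-word lexicon as a stand-in for the classifier, which the slides do not specify. The lexicon, the window size, the 1 / -1 / 0 encoding and the function name `classify_document` are assumptions for illustration.

```python
POSITIVE = {"great", "good", "sharp", "long"}   # hypothetical lexicon
NEGATIVE = {"poor", "bad", "blurry", "short"}   # hypothetical lexicon


def classify_document(text, features, window=2):
    """Map each single-word product feature to 1, -1 or 0 for one document.

    Each opinion word within `window` tokens of a feature mention adds
    +1 (positive) or -1 (negative); the sign of the total is returned,
    and 0 means no evidence or a tie.
    """
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    result = {}
    for feature in features:
        score = 0
        for i, token in enumerate(tokens):
            if token != feature:
                continue
            nearby = tokens[max(0, i - window): i + window + 1]
            score += sum(w in POSITIVE for w in nearby)
            score -= sum(w in NEGATIVE for w in nearby)
        result[feature] = (score > 0) - (score < 0)
    return result


doc = "The zoom is great, but the battery is poor."
polarities = classify_document(doc, ["zoom", "battery", "price"])
```

One such dictionary per document gives one column of a feature-by-document polarity table.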

TOMA
FOMA
Table summary

                 Doc1  Doc2  Doc3
product name        1     0     1
picture quality     1    -1     1
weight              0     0     0
battery life        1     1    -1
price               1     0     1
zoom               -1     1    -1

Table : results of TOMA

                 Doc1  Doc2  Doc3
product name        1     0     1
picture quality     1    -1     1
weight              1     1    -1
battery life        1     1    -1
price               1     0     1
zoom                0     0     0

Table : results of FOMA

                 Doc1  Doc2  Doc3  Doc4  Doc5  Doc6
product name        1     0     1     1     0     1
picture quality     1    -1     1     1    -1     1
weight              1     1    -1     1     1    -1
battery life        1     1    -1     1     1    -1
price               1     0     1     1     0     1
zoom               -1     1    -1    -1     1    -1

Table : Investigation results of TOMA
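One way to read the six-document table is as a merge of the two agents' results. The sketch below reproduces it under an assumed rule that is inferred from the numbers and not stated in the slides: rows are concatenated per feature, and an agent whose row is all zeros (feature not found) takes over the other agent's row. The function name `merge_agent_results` is hypothetical.

```python
def merge_agent_results(first, second):
    """Merge two agents' feature-by-document polarity tables.

    Assumed rule (inferred from the example tables): concatenate the
    rows per feature; an all-zero row is replaced by the other agent's
    row before concatenation. Both tables must list the same features.
    """
    merged = {}
    for feature in first:
        a, b = first[feature], second[feature]
        if not any(a):
            a = b
        elif not any(b):
            b = a
        merged[feature] = a + b
    return merged


# Values copied from the TOMA and FOMA result tables.
toma = {
    "product name": [1, 0, 1],
    "picture quality": [1, -1, 1],
    "weight": [0, 0, 0],
    "battery life": [1, 1, -1],
    "price": [1, 0, 1],
    "zoom": [-1, 1, -1],
}
foma = {
    "product name": [1, 0, 1],
    "picture quality": [1, -1, 1],
    "weight": [1, 1, -1],
    "battery life": [1, 1, -1],
    "price": [1, 0, 1],
    "zoom": [0, 0, 0],
}
summary = merge_agent_results(toma, foma)
```

Under this reading, "weight" is filled in from FOMA and "zoom" from TOMA, which matches the Doc1 to Doc6 rows shown above.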

Figure : Multi-agent system based opinion mining

Thank You
