Outline: Introduction, Research Questions, Related Work, Proposed Methodology, Summary and Future Works
Sentiment Classification of on-line Products based on Machine Learning Techniques and Multi-agent System Technologies
Almashraee Mohammed, Dagmar Monett-Diaz
November 19, 2012
School of Economics and Law HWR-Berlin, Humboldt University Berlin
Introduction
Research Questions
Related Works
Proposed Methodology
Summary and Future Works
Social Media Sources... Why!?
Figure : Social Media Sources and Application Domains
Opinion Mining/Sentiment Analysis... Why!?
Figure : Opinion Mining Research
Opinion Mining - A Classification Task
Figure : Sentiment Classification
Levels of Sentiment Analysis
Document Level
    Single entity in each document
Sentence Level
    Single entity in each sentence
    Subjectivity classification
(Entity - Aspects)/Feature Level
    Opinion-based classification
        1 Sentiment (positive or negative)
        2 Opinion target
    Structured Summary
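As a rough illustration of the feature level, each opinion can be held as a pair of opinion target and sentiment, and the structured summary is then a per-target count of positive and negative opinions. The `Opinion` class and the `structured_summary` function are hypothetical names chosen for this sketch and do not come from the slides.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    target: str     # opinion target (a product aspect/feature)
    sentiment: int  # +1 = positive, -1 = negative (assumed encoding)


def structured_summary(opinions):
    """Count positive and negative opinions per opinion target."""
    summary = defaultdict(Counter)
    for op in opinions:
        label = "positive" if op.sentiment > 0 else "negative"
        summary[op.target][label] += 1
    return {target: dict(counts) for target, counts in summary.items()}


# Made-up opinions, for illustration only.
opinions = [
    Opinion("battery life", +1),
    Opinion("battery life", -1),
    Opinion("battery life", +1),
    Opinion("zoom", -1),
]
summary = structured_summary(opinions)
print(summary)
```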
Research Questions
1 To what extent are multi-agent systems (MAS) and machine learning (ML) able to solve the problem of product sentiment classification?
2 In what way and form should agents communicate with each other?
3 How should the results from different sources be merged?
Related Work
[Pang et al., 2002]
Features             #features   Freq or pres   NB     ME     SVM
unigrams             16165       freq.          78.7   N/A    72.8
unigrams             16165       pres.          81.0   80.4   82.9
unigrams + bigrams   32330       pres.          80.6   80.8   82.7
bigrams              16165       pres.          77.3   77.4   77.1
unigrams+POS         16695       pres.          81.5   80.4   81.9
adjectives           2633        pres.          77.0   77.7   75.1
top 2633 unigrams    2633        pres.          80.3   81.0   81.4
unigrams+position    22430       pres.          81.0   80.1   81.6
Table : Machine Learning Features for Classification (accuracy in %; NB = Naive Bayes, ME = Maximum Entropy, SVM = Support Vector Machine)
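The "Freq or pres" column separates two ways of encoding the same n-gram features: by occurrence count or by mere presence. The sketch below shows the difference for unigrams and bigrams. It is a minimal illustration, not the code of Pang et al.; the `featurize` helper and the whitespace tokenizer are assumptions of this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def featurize(text, use_bigrams=False, presence=True):
    """Map a review to a sparse feature dict.

    presence=True  -> value 1 for every feature that occurs ("pres." rows)
    presence=False -> value is the occurrence count ("freq." row)
    """
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    feats = Counter(ngrams(tokens, 1))
    if use_bigrams:
        feats.update(ngrams(tokens, 2))
    if presence:
        return {f: 1 for f in feats}
    return dict(feats)


review = "good zoom good price"  # made-up review text
freq = featurize(review, presence=False)
pres = featurize(review, presence=True)
both = featurize(review, use_bigrams=True)
print(freq)
print(pres)
print(sorted(both))
```

Either dictionary can then be passed to a classifier such as Naive Bayes or an SVM; only the feature values differ, not the feature set.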
[Liu et al., 2005]
    Opinion Observer prototype
    Supervised rule-discovery techniques
[Hu and Liu, 2004a,b]
    Frequent product features found by association rule mining
[Schweitzer and Garcia, 2010]
    Fast and highly efficient search based on MAS
    Relatedness between gathered pages and the user's request
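The idea of finding frequent product features can be sketched with plain support counting: a candidate feature is kept when it is mentioned in at least a given fraction of the review sentences. This is a simplified stand-in that only covers single-item sets, not the full association rule mining procedure; the function name, the threshold and the sample data are assumptions of this sketch.

```python
from collections import Counter


def frequent_features(transactions, min_support):
    """Return candidate features whose support is at least min_support.

    transactions: list of sets, one per review sentence, each holding the
                  candidate feature words found in that sentence.
    min_support:  minimum fraction of sentences that must mention a feature.
    """
    counts = Counter()
    for items in transactions:
        counts.update(set(items))  # count each feature once per sentence
    n = len(transactions)
    return {f: c / n for f, c in counts.items() if c / n >= min_support}


# Made-up candidate features per sentence, for illustration only.
sentences = [
    {"zoom", "price"},
    {"zoom"},
    {"battery"},
    {"zoom", "battery"},
]
frequent = frequent_features(sentences, min_support=0.5)
print(frequent)
```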
Proposed Methodology
Figure : Multi-agent system based opinion mining
System Input: A user product + relevant corpora collected from different social sources using several agents.
System Output: A summary of the product features with their associated polarities.
The idea:
1 Information retrieval and representation
2 Relatedness between gathered pages and the user's request
    Feature extraction
    Polarity classification
3 Learning extra features
TOMA
FOMA
Table summary

                  Doc1   Doc2   Doc3
product name       1      0      1
picture quality    1     -1      1
weight             0      0      0
battery life       1      1     -1
price              1      0      1
zoom              -1      1     -1
Table : results of TOMA

                  Doc1   Doc2   Doc3
product name       1      0      1
picture quality    1     -1      1
weight             1      1     -1
battery life       1      1     -1
price              1      0      1
zoom               0      0      0
Table : results of FOMA

                  Doc1   Doc2   Doc3   Doc4   Doc5   Doc6
product name       1      0      1      1      0      1
picture quality    1     -1      1      1     -1      1
weight             1      1     -1      1      1     -1
battery life       1      1     -1      1      1     -1
price              1      0      1      1      0      1
zoom              -1      1     -1     -1      1     -1
Table : Investigation results of TOMA
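One way to read the three tables: the six-document table equals the TOMA and FOMA tables placed side by side, except that a feature row with only zeros (weight for TOMA, zoom for FOMA) is filled with the other agent's row. The sketch below reproduces the third table under exactly that reading. The merge rule is inferred from the numbers and is an assumption, not a stated part of the method; `merge_agents` is a hypothetical name, and both tables are assumed to list the same features.

```python
def merge_agents(first, second):
    """Merge two feature-by-document polarity tables.

    Each table maps a feature name to a list of polarities (1, -1 or 0),
    one per document. An all-zero row is treated as "this agent knows
    nothing about the feature" and is filled from the other agent's row
    (assumed rule, inferred from the example tables).
    """
    merged = {}
    for feature in first:
        a, b = first[feature], second[feature]
        a_filled = a if any(a) else b
        b_filled = b if any(b) else a
        merged[feature] = list(a_filled) + list(b_filled)
    return merged


toma = {
    "product name":    [1, 0, 1],
    "picture quality": [1, -1, 1],
    "weight":          [0, 0, 0],
    "battery life":    [1, 1, -1],
    "price":           [1, 0, 1],
    "zoom":            [-1, 1, -1],
}
foma = {
    "product name":    [1, 0, 1],
    "picture quality": [1, -1, 1],
    "weight":          [1, 1, -1],
    "battery life":    [1, 1, -1],
    "price":           [1, 0, 1],
    "zoom":            [0, 0, 0],
}
merged = merge_agents(toma, foma)
for feature, row in merged.items():
    print(f"{feature:16s}", row)
```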
Figure : Multi-agent system based opinion mining
Thank You