Selecting Attributes for Sentiment Classification Using Feature Relation Networks
Intelligent Database Systems Lab
Presenter: JIAN-REN CHEN
Authors: Ahmed Abbasi, Stephen France, Zhu Zhang, and Hsinchun Chen
2011, IEEE TKDE
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
Sentiment analysis has emerged as a method for mining opinions from text archives.
It is a challenging problem:
1. It requires the use of large quantities of linguistic features.
2. These heterogeneous n-gram categories must be integrated into a single feature set.
- Issues: noise, redundancy, and computational limitations
Sentiment has two aspects: 1) polarity 2) intensity
Example: “I don’t like you” vs “I hate you” (same polarity, different intensity)
n-gram (Markov model)
Weather: sunny, cloudy, rainy
美麗 (“beautiful”) vs 美痢 (a homophone miswriting)
“HAPAX” and “DIS” tags
“I hate Jim” is replaced with “I hate HAPAX”
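The tag replacement can be illustrated with a minimal Python sketch. It assumes, following the usual meaning of hapax and dis legomena, that HAPAX marks a word occurring exactly once in the corpus and DIS a word occurring exactly twice. The slide itself does not spell this out. The function name `tag_legomena` and the toy corpus are hypothetical.

```python
from collections import Counter

def tag_legomena(sentences):
    """Replace once-occurring words with HAPAX and twice-occurring words with DIS.

    Assumption: counts are case-insensitive and taken over the whole corpus.
    """
    tokenized = [s.split() for s in sentences]
    counts = Counter(w.lower() for sent in tokenized for w in sent)
    tag_for_count = {1: "HAPAX", 2: "DIS"}
    return [
        " ".join(tag_for_count.get(counts[w.lower()], w) for w in sent)
        for sent in tokenized
    ]

# Toy corpus (made up for illustration): "Jim" occurs once, "rain" twice.
corpus = ["I hate Jim", "I hate rain", "I hate Mondays and rain"]
tagged = tag_legomena(corpus)
print(tagged[0])  # I hate HAPAX
```

Rare words are collapsed into shared tags, so n-grams such as “I hate HAPAX” can still be counted across many documents.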
Objectives
• Feature Relation Network (FRN) considers semantic information
and also leverages the syntactic relationships between n-gram
features.
- Goal: enhanced sentiment classification on extended sets of
heterogeneous n-gram features.
Methodology - Extended N-Gram Feature Set
Methodology - Subsumption Relations
A subsumes B (A → B)
Example: “I love chocolate”
unigrams: I, LOVE, CHOCOLATE
bigrams: I LOVE, LOVE CHOCOLATE
trigrams: I LOVE CHOCOLATE
What about the bigrams and trigrams? It depends on their weight: they are kept only if their weight exceeds that of their more general lower-order counterparts by threshold t.
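A minimal Python sketch of the subsumption rule, under stated assumptions: each n-gram is a tuple of tokens with a precomputed weight, a higher-order n-gram is compared against the highest-weighted lower-order n-gram it contains, and "exceeds by threshold t" is read as a weight difference of at least t. The weights below are invented for illustration, and the helper names `sub_ngrams` and `apply_subsumption` are hypothetical.

```python
def sub_ngrams(ngram):
    """All contiguous lower-order n-grams contained in `ngram` (a tuple of tokens)."""
    n = len(ngram)
    return {ngram[i:i + k] for k in range(1, n) for i in range(n - k + 1)}

def apply_subsumption(weights, t):
    """Keep an n-gram only if it beats every lower-order counterpart by at least t."""
    kept = {}
    for ngram, w in weights.items():
        parents = [weights[s] for s in sub_ngrams(ngram) if s in weights]
        # Unigrams have no lower-order counterparts, so they are always kept.
        if not parents or w - max(parents) >= t:
            kept[ngram] = w
    return kept

# Hypothetical weights for the n-grams of "I love chocolate".
weights = {
    ("I",): 0.01,
    ("LOVE",): 0.30,
    ("CHOCOLATE",): 0.05,
    ("I", "LOVE"): 0.31,
    ("LOVE", "CHOCOLATE"): 0.45,
    ("I", "LOVE", "CHOCOLATE"): 0.46,
}
kept = apply_subsumption(weights, t=0.05)
print(sorted(kept, key=len))
```

With these made-up weights, "I LOVE" adds almost nothing over "LOVE" and is dropped, "LOVE CHOCOLATE" clears the threshold and is kept, and the trigram is dropped because it barely improves on "LOVE CHOCOLATE".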
Methodology - Parallel Relations
A parallel B (A - B)
POS tag: “ADMIRE_VP” → “like”
semantic class: “SYN-Affection” → “love”
If A and B have a correlation coefficient greater than some threshold p, one of the attributes is removed to avoid redundancy.
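A minimal Python sketch of the parallel-relation check, assuming Pearson correlation over per-document occurrence vectors and assuming that the lower-weighted attribute of a correlated pair is the one removed (the slide only says that one of them is). The occurrence vectors, weights, and the function names `pearson` and `apply_parallel` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def apply_parallel(columns, weights, p):
    """Visit features by descending weight; drop any that correlates above p
    with a feature that has already been kept."""
    kept = []
    for name in sorted(columns, key=lambda f: weights[f], reverse=True):
        if all(pearson(columns[name], columns[k]) <= p for k in kept):
            kept.append(name)
    return kept

# Made-up occurrence vectors over six documents.
columns = {
    "like":          [1, 0, 1, 1, 0, 0],
    "ADMIRE_VP":     [1, 0, 1, 1, 0, 0],  # fires wherever "like" does
    "love":          [0, 1, 0, 0, 1, 0],
    "SYN-Affection": [0, 1, 0, 0, 1, 0],  # fires wherever "love" does
    "chocolate":     [0, 1, 1, 0, 0, 1],
}
weights = {"love": 0.50, "like": 0.40, "ADMIRE_VP": 0.35,
           "SYN-Affection": 0.30, "chocolate": 0.10}
kept = apply_parallel(columns, weights, p=0.90)
print(kept)
```

In this toy data the tag features duplicate their word-level counterparts and are removed, while "chocolate" is weakly correlated with everything kept and survives.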
Methodology - The Complete Network
Methodology - Incorporating Semantic Information
Experiments - Datasets
Experiments - FRN vs Univariate
Experiments - FRN vs Univariate (WithinOne)
Experiments - FRN vs Multivariate
Experiments - FRN vs Multivariate (WithinOne)
Experiments - FRN vs Hybrid
Experiments - FRN vs Hybrid (WithinOne)
Experiments - Ablation
Experiments - Parameter
Subsumption threshold t: 0.0005, 0.005, 0.05, and 0.5
Parallel relation threshold p: 0.80, 0.90, and 1.00
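The listed values can be enumerated as a parameter grid. This sketch assumes every (t, p) combination is evaluated, which the slide does not state explicitly. The names `T_VALUES`, `P_VALUES`, and `grid` are hypothetical.

```python
from itertools import product

T_VALUES = (0.0005, 0.005, 0.05, 0.5)  # subsumption threshold t
P_VALUES = (0.80, 0.90, 1.00)          # parallel relation threshold p

# Assumption: a full grid of 4 x 3 = 12 settings.
grid = list(product(T_VALUES, P_VALUES))
for t, p in grid:
    print(f"t={t}, p={p}")
```

Since the removal rule requires a correlation greater than p, the setting p = 1.00 effectively switches parallel-relation removal off.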
Experiments - Average Runtimes
Conclusions
• FRN had significantly higher best accuracy and best
percentage within-one across three testbeds.
• The ablation and parameter testing results show that the
subsumption and parallel relations, and their thresholds,
play an important role.
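The two reported measures can be sketched in Python. The slides do not define "within-one", so this assumes ordinal class labels (for example star ratings) and counts a prediction as within-one when it is at most one class away from the true label. The function names and sample labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def within_one(y_true, y_pred):
    """Fraction of predictions at most one ordinal class away from the truth
    (assumed definition, not taken from the slides)."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up 5-class labels for illustration.
y_true = [1, 2, 3, 4, 5, 5]
y_pred = [1, 3, 5, 4, 4, 2]
acc = accuracy(y_true, y_pred)
w1 = within_one(y_true, y_pred)
print(acc, w1)
```

Within-one is never lower than exact accuracy, because every exact match is also within one class.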
Comments
• Advantages
- accurate, computationally efficient
• Disadvantage
- sensitive to the ablation and parameter settings
• Applications
- sentiment classification
- feature selection method