1
Sentiment Analysis of Blogs by Combining Lexical Knowledge
with Text Classification
(ACM KDD '09) Prem Melville, Wojciech Gryc, Richard D. Lawrence
Date: 07/07/09
Speaker: Hsu, Yu-Wen
Advisor: Dr. Koh, Jia-Ling
2
Outline
Introduction
Baseline Approaches
Pooling Multinomials
Empirical Evaluation
Conclusion & Future Work
3
Introduction
Most prior work in sentiment analysis uses knowledge-based approaches, which classify the sentiment of texts based on dictionaries defining the sentiment polarity of words, and on simple linguistic patterns.
Recently, some studies have taken a machine learning approach, building text classifiers trained on documents that have been human-labeled as positive or negative. Such classifiers, however, do not adapt well to different domains and require much effort in human annotation of documents.
4
We present a new machine learning approach that overcomes these drawbacks by effectively combining background lexical knowledge with supervised learning.
We construct a generative model based on a lexicon of sentiment-laden words, and a second model trained on labeled documents. The distributions from these two models are then adaptively pooled to create a composite multinomial Naïve Bayes classifier that captures both sources of information.
5
Baseline Approaches
Lexical classification: Given a lexicon of positive and negative terms, one straightforward approach to using this information is to measure the frequency of occurrence of these terms in each document, and predict the polarity that occurs more often.
Feature supervision: Use a lexicon along with unlabeled data in a semi-supervised learning setting. There have been few approaches to incorporating such background knowledge into learning.
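The counting baseline above can be sketched as follows (the lexicon entries here are illustrative placeholders, not the actual term lists used in the paper):

```python
# Lexical classification baseline: count sentiment-laden terms in a document
# and predict the polarity with the higher count. Lexicon entries are toy examples.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def lexical_classify(document: str) -> str:
    """Label a document by comparing counts of positive vs. negative terms."""
    words = document.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # tie: no lexical evidence either way
```

This requires no labeled training data at all, which is exactly why it serves as a baseline for the knowledge-based approach.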
6
Pooling Multinomials
The multinomial Naïve Bayes classifier commonly used for text categorization relies on three assumptions: (1) documents are produced by a mixture model; (2) there is a one-to-one correspondence between each mixture component and class; (3) each mixture component is a multinomial distribution of words, i.e., given a class, the words in a document are produced independently of each other.
7
The likelihood of a document D:
P(c_j|D) = P(c_j) P(D|c_j) / P(D)
where P(c_j) is the probability of the class and P(D|c_j) is the probability of the document given the class. Since the words of a document are generated independently given the class:
P(D|c_j) = \prod_{w_i \in D} P(w_i|c_j)
8
We compute the class membership probabilities of each class; the class with the highest likelihood is predicted:
c = \arg\max_{c_j} P(c_j) \prod_{w_i \in D} P(w_i|c_j)
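The prediction rule above can be sketched in log space to avoid numerical underflow (the priors and word distributions below are toy values for illustration):

```python
import math

def predict(doc_words, priors, word_probs):
    """Multinomial NB: pick argmax_c P(c) * prod_i P(w_i|c), computed in log space.

    priors: {class: P(c)}; word_probs: {class: {word: P(w|c)}}.
    Words absent from a class's distribution get a small floor probability.
    """
    best_class, best_logp = None, float("-inf")
    for c, prior in priors.items():
        logp = math.log(prior)
        for w in doc_words:
            logp += math.log(word_probs[c].get(w, 1e-9))
        if logp > best_logp:
            best_class, best_logp = c, logp
    return best_class

# Toy example: "good" is more likely under the positive class, "bad" under negative.
priors = {"+": 0.5, "-": 0.5}
word_probs = {"+": {"good": 0.4, "bad": 0.1},
              "-": {"good": 0.1, "bad": 0.4}}
```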
9
*Combining probability distributions
Linear opinion pool:
P(w_i|c_j) = \sum_{k=1}^{K} \alpha_k P_k(w_i|c_j)
where K is the number of experts, P_k(w_i|c_j) is the probability assigned by expert k to word w_i occurring in a document of class c_j, and the weights \alpha_k sum to one.
10
Logarithmic opinion pool:
P(w_i|c_j) = \frac{1}{Z} \prod_{k=1}^{K} P_k(w_i|c_j)^{\alpha_k}
where Z is a normalizing constant and the weights \alpha_k satisfy restrictions that assure P(w_i|c_j) is a probability distribution.
Weights of individual experts: \epsilon_k denotes the error of expert k on the training set; the weights \alpha_k are set based on these training errors.
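Both pooling rules can be sketched as follows (the expert distributions and weights are toy values; in the paper the two experts are the lexicon-based model and the model trained on labeled documents):

```python
def linear_pool(expert_probs, weights):
    """Linear opinion pool: P(w|c) = sum_k alpha_k * P_k(w|c); weights sum to 1."""
    return sum(a * p for a, p in zip(weights, expert_probs))

def log_pool(expert_probs_per_word, weights):
    """Logarithmic opinion pool over a vocabulary, for one class:
    P(w|c) proportional to prod_k P_k(w|c)^alpha_k, renormalized by Z."""
    unnorm = {}
    for w, probs in expert_probs_per_word.items():
        prod = 1.0
        for a, p in zip(weights, probs):
            prod *= p ** a
        unnorm[w] = prod
    z = sum(unnorm.values())  # normalizing constant Z
    return {w: v / z for w, v in unnorm.items()}
```

Note the design difference: the linear pool averages the experts' probabilities word by word, while the logarithmic pool multiplies them and must renormalize over the whole vocabulary, so disagreement between experts is penalized more sharply.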
11
*A generative background-knowledge model
Definitions:
V – the vocabulary, i.e., the set of words in our domain
P – the set of positive terms from the lexicon that exist in V
N – the set of negative terms from the lexicon that exist in V
U – the set of unknown terms, i.e., U = V \ (P ∪ N)
m – the size of the vocabulary, i.e., |V|
p – the number of positive terms, i.e., |P|
n – the number of negative terms, i.e., |N|
12
Property 1: (equation shown only as a figure in the original slides)
13
Property 2: (equation shown only as a figure in the original slides)
For this to be true for all values of α and β, a further constraint follows (also shown as a figure).
14
Property 3: Since a positive document is more likely to contain a positive term than a negative term, and vice versa, we would like (with r > 1 as the polarity level):
P(w_p|+) / P(w_n|+) = r and P(w_n|-) / P(w_p|-) = r
Property 4: Since each component of our mixture model is a probability distribution, we have the following constraint on the conditional probabilities for each class c_j:
\sum_{w_i \in V} P(w_i|c_j) = 1
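One way to turn these properties into concrete word probabilities can be sketched as follows. This is an illustrative assumption not stated on these slides: unknown words are given the uniform probability 1/m in both classes, after which Property 4 fixes the probability of lexicon terms.

```python
def lexicon_model(p, n, m, r):
    """Build class-conditional word probabilities from lexicon counts.

    p, n: number of positive/negative lexicon terms in the vocabulary
    m: vocabulary size; r: polarity level (r > 1)
    Assumes (hypothetically) that unknown words are uniform (1/m) in both
    classes, then solves p*a + (n/r)*a + (m-p-n)/m = 1 for
    a = P(positive word | positive class); by symmetry, a is also
    P(negative word | negative class). Property 3 gives a/r for the
    opposite-polarity terms.
    """
    u = m - p - n                            # number of unknown terms
    beta = 1.0 / m                           # P(unknown word | either class)
    alpha = (1.0 - u * beta) / (p + n / r)   # Property 4 for the positive class
    # Representative probabilities for any positive/negative/unknown word:
    pos = {"pos_word": alpha, "neg_word": alpha / r, "unk_word": beta}
    neg = {"pos_word": alpha / r, "neg_word": alpha, "unk_word": beta}
    return pos, neg
```

For example, with p = n = 2, m = 10, and r = 2, each class's distribution sums to one and positive words are twice as likely as negative words in the positive class, as Property 3 requires.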
15–16
(Derivation of the resulting class-conditional word probabilities; the equations appear only as figures in the original slides.)
Empirical Evaluation
Data sets:
Lotus blogs: posts about IBM Lotus collaborative software; 34 positive and 111 negative.
Political candidate blogs: posts focusing on politics, about Obama and Clinton; 49 positive and 58 negative.
Movie reviews: apart from the blog data that we generated, we also used the publicly available movie-review data; 1000 positive and 1000 negative reviews.
17
Results
(Result figures and tables on slides 18–22 appear only as images in the original slides.)
Evaluating sensitivity of Pooling Multinomials to the polarity level parameter.
23
Conclusion
We develop an effective framework for incorporating lexical knowledge in supervised learning for text categorization.
We apply the developed approach to the task of sentiment classification, extending the state of the art in the field, which has focused primarily on using either background knowledge or supervised learning in isolation.
24
Future Work
There has been a flurry of recent work in the area of transfer learning that could be applied to extend a background knowledge-based model to incorporate data from different domains.
The fundamental challenge in such transfer learning is accounting for the training and test sets being from different distributions.