Weiwei [email protected] Need-based Product Review Mining

Outline
- Introduction
  - Traditional Product Review Mining
  - Change to “Need-based Product Review Mining”
- Research Area
- Technology Related
  - Need Recognition
  - Feature (explicit & implicit) Extraction
  - Opinion Extraction
  - Scoring and Ranking
- Conclusion

Introduction

Traditional Product Review Mining
- Product-centric (product-based)
- Process:
  - Select a product
  - Review mining
  - Structural visualization
- Papers: Liu.B [KDD04, WWW05], Dave.K [WWW03], Turney.P [ACL02], Liu [KDD08], etc.
- An example: CRO

An example: CRO
[1] Select a product (or input a product)
[2] Review mining of the corresponding product
[3] Structural visualization

Change to “Need-based Mining”
- Motivation: analysis of online purchasing
  - “Customers seek to satisfy a particular need” [Kotler03]
  - Online purchasing vs. traditional in-store purchasing
    - In a store, a clerk helps you
    - Online, sellers use chat software to interact with customers
  - What are they talking about? They help translate the customer’s “need” into a specific product

Change to “Need-based Mining”
- Motivation: why do this? “Why not let customers do this alone?”
  - They don’t know what the product attributes mean
  - They only have a need in mind
  - They need a list of recommended products that satisfy their need
- How to translate a need is the problem

Need-based Mining
- Need-based (or user-centric)
- Focuses on multiple products within a product category (not a single product)
- Associates a “need” with a set of product attributes
- Recommends products through sentiment analysis of those attributes

Research Area

Research Framework
[Figure: two framework diagrams, labeled “CSS Research Framework” and “Traditional Product Review Mining Research Framework”, drawn over the nodes Customer, Product Review, Product, and Need]

Research Framework
[Figure: research framework pipeline with an online part and an offline part. Labeled components: Review DB, a need, feature extraction, merge similar features, ontology construction, need recognition, opinion extraction (<feature, opinion> pairs, including implicit feature identification), sentiment analysis, aggregation function, product scoring, a ranked list of products]

Technology Related
1. Need Recognition
2. Feature Extraction
3. Opinion Extraction (sentiment analysis)
4. Scoring and Ranking

1. Need Recognition
- Need definition: “Feeling of want that provides a basis for behavior or action” (name)
- The words that name a need are implicitly related to a set of features of a product category
- Each feature has a weight
- Need = <n, F, W>
- Some examples:
  - “a camera for climbing”: <“Climbing”, {size, weight, wide-angle}, {0.3, 0.5, 0.2}>
  - “a sun-resistant cosmetic”: <“sun-resistant”, {whitening, price}, {0.8, 0.2}>
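
The triple Need = <n, F, W> can be held in a small record type. The sketch below is illustrative only: the class name `Need`, its field names, and the check that weights sum to 1 are assumptions (both slide examples happen to sum to 1), not something the slides prescribe.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Need:
    """A need as the triple <n, F, W> (field names are hypothetical)."""
    name: str        # n: word or phrase naming the need
    features: tuple  # F: features associated with the need
    weights: tuple   # W: one weight per feature, in the same order

    def __post_init__(self):
        if len(self.features) != len(self.weights):
            raise ValueError("each feature needs exactly one weight")
        # Assumption: weights are normalised, as in the two slide examples.
        if not math.isclose(sum(self.weights), 1.0):
            raise ValueError("weights are assumed to sum to 1")


# The camera example from the slide.
climbing = Need("climbing", ("size", "weight", "wide-angle"), (0.3, 0.5, 0.2))
```

Keeping F and W as parallel tuples mirrors the slide notation; a dict from feature to weight would work equally well.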

Need Recognition
[Figure: a need name linked to feature clusters 1, 2, 3, 4]
- Degree of association (DOA) calculation: PMI, LSA, etc.

Introduction to PMI
- Fact object f and discriminator object d
- Based on co-occurrence of words
- Definition:
  $$PMI(f, d) = \lg \frac{P(f, d)}{P(f)\,P(d)}$$
- PMI = 0: independent; PMI > 0: dependent
- Estimation: “PMI-IR” [21]
  $$PMI\text{-}IR(f, d) = \log \frac{hits(f, d, \mathit{Constraint})}{hits(f)\,hits(d)}$$
- Constraint: NEAR, AND, etc. [23 AltaVista]

DOA Calculation
- How do we find and quantify the association between two objects?
- Ideal condition: the full set of product reviews is available as the corpus, so
  $$PMI(A, B) = \lg \frac{P(A, B)}{P(A)\,P(B)}$$
- Actual condition: a reviews corpus plus a PMI-IR-like [21] algorithm
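
A minimal sketch of a PMI-IR-like DOA estimate over a reviews corpus. It rests on several assumptions that are not in the slides: co-occurrence is counted per review (an AND-style constraint), hit counts are turned into probabilities by dividing by the corpus size, and the logarithm is base 2. The function name `doa_pmi` and the toy reviews are hypothetical.

```python
import math


def doa_pmi(reviews, a, b):
    """PMI of terms a and b from review-level hit counts.

    Assumptions: whitespace tokenisation, lower-casing, and a review counts
    as a joint hit when it contains both terms anywhere.
    """
    docs = [set(r.lower().split()) for r in reviews]
    n = len(docs)
    hits_a = sum(a in d for d in docs)
    hits_b = sum(b in d for d in docs)
    hits_ab = sum(a in d and b in d for d in docs)
    if hits_ab == 0:
        # No joint hits: treat the pair as unassociated.
        return float("-inf")
    p_ab = hits_ab / n
    return math.log2(p_ab / ((hits_a / n) * (hits_b / n)))


reviews = [
    "light camera great for climbing",
    "climbing trip with a light body",
    "heavy lens but good zoom",
    "zoom is slow",
]
score = doa_pmi(reviews, "climbing", "light")
```

With a fixed corpus the division by the corpus size only shifts every score by the same constant, so it does not change the ordering of features for a given need.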

Need Recognition
[Figure: a need name linked to feature clusters 1, 2, 3, 4]
- Find the feature set F and the weights W: F -> {F1, F3}, W -> {W1, W3}

Feature Set and Weight
- Feature choosing condition: degree of association (DOA) ≥ δ (threshold)
- Weight calculation over the chosen set:
  $$W_j = \frac{DOA(F_j, n)}{\sum_{F_i \in F} DOA(F_i, n)}$$
- Need description: Need = <name, F, W>
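
The selection and weighting step can be sketched as follows: keep the features whose DOA with the need name reaches the threshold δ, then divide each kept DOA by the sum of the kept DOAs. The function name `select_features`, the input dict, and the DOA values are made up for illustration; they were chosen so that the result matches the camera example.

```python
def select_features(name, doa, delta):
    """Return (name, F, W) from a dict mapping feature -> DOA(feature, name).

    Features with DOA below delta are dropped; the remaining DOA values are
    normalised so the weights sum to 1. Assumes delta > 0, so every kept
    DOA is positive.
    """
    kept = {f: score for f, score in doa.items() if score >= delta}
    total = sum(kept.values())
    if total == 0:
        return name, [], []
    features = sorted(kept)  # fixed order so F and W stay aligned
    weights = [kept[f] / total for f in features]
    return name, features, weights


# Hypothetical DOA values for the need "climbing".
doa = {"size": 0.6, "weight": 1.0, "wide-angle": 0.4, "price": 0.1}
need = select_features("climbing", doa, delta=0.3)
```

Here "price" falls below the threshold and is dropped, and the remaining weights come out close to 0.3, 0.5, and 0.2.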

2. Feature Extraction (Ontology Construction)
- Related work: supervised methods, unsupervised methods
- Disadvantages:
  - Similar feature clustering problem (concept relationship discovery)
  - Implicit feature recognition problem

Feature Extraction
- What is a feature (broader than an attribute)?
  - Not only the product parameters (attributes)
  - All aspects of the product that consumers comment on
  - “Official parameter specification” + “consumer comment aspects”
- Features are infinite
- Explicit features and implicit features

Feature Extraction (Explicit Feature) [17]
[Figure: explicit feature extraction pipeline. Labeled components: relevant product review corpus, irrelevant product review corpus, candidate feature set, noise filter, feature set, similar feature clustering, supplementary]

Similar Feature Clustering
- Related works: [14], [15], [18] (“Reinforcement Clustering heterogeneous web objects”)
- First problem: how should similar features be pre-defined?
  - Synonymous features, i.e. the same aspect of the product
- Experiments:
  - Content only to cluster similar features
  - Link only to cluster similar features
  - Content plus link to cluster similar features
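
Experiments
[Figure: diagram of features and opinions]

Content-based method
- Similarity calculation: Sim(f, w), built from the terms Substr(f, w), CountOp(f, w), and Cooccur(f, w) (the operators combining them are not recoverable from the transcript)
- Clustering: PAM
- Measurement: entropy, precision, recall, F-measure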

Link-based method
- Similarity calculation: SimRank [19]
- Clustering: PAM
- Measurement: entropy, precision, recall, F-measure
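
For readers unfamiliar with SimRank, here is a minimal sketch of the basic iteration applied to a feature-opinion link graph: two nodes are similar when their neighbours are similar. It is the plain textbook recurrence on an undirected graph, not necessarily the variant used in the experiments. The decay factor `c`, the iteration count, the helper names, and the toy links are assumptions, and feature and opinion names are assumed not to collide.

```python
from itertools import product


def build_graph(links):
    """Undirected neighbour sets from (feature, opinion) links."""
    neighbors = {}
    for feature, opinion in links:
        neighbors.setdefault(feature, set()).add(opinion)
        neighbors.setdefault(opinion, set()).add(feature)
    return neighbors


def simrank(neighbors, c=0.8, iterations=10):
    """Naive SimRank: s(a, a) = 1, otherwise
    s(a, b) = c / (|N(a)| |N(b)|) * sum of s(i, j) over i in N(a), j in N(b)."""
    nodes = list(neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new[(a, b)] = 1.0
            elif not neighbors[a] or not neighbors[b]:
                new[(a, b)] = 0.0
            else:
                total = sum(sim[(i, j)] for i in neighbors[a] for j in neighbors[b])
                new[(a, b)] = c * total / (len(neighbors[a]) * len(neighbors[b]))
        sim = new
    return sim


# Toy links: "weight" and "size" share the opinion word "light".
links = [("weight", "heavy"), ("weight", "light"),
         ("size", "light"), ("size", "small"), ("price", "cheap")]
sim = simrank(build_graph(links))
```

The resulting pairwise similarities could then be handed to a medoid-based clusterer such as PAM, which only needs a distance (for example 1 minus similarity) between items.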

Content plus link method
[Figure sequence: step-by-step diagrams over features and opinions]

Experiment Results

             Link method   Content-based   Content plus link
Entropy      5.9891        3.4160          2.4037
Precision    0.6443        0.6860          0.7308
Recall       0.5104        0.6895          0.7312
F-measure    0.5688        0.6872          0.7310

Feature Extraction (Implicit Feature)
[Figure: opinions mapped to features]
- Confidence value: Opinion(w) -> F(i)
- Feature relationship learning: synonymy, hypernymy/hyponymy, part/whole
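
One way to read the confidence value Opinion(w) -> F(i) is as an association-rule confidence learned from explicit <feature, opinion> pairs: the share of occurrences of opinion word w that modify feature F(i). The slides do not define the measure, so this reading, the function names, the threshold `min_conf`, and the toy pairs are all assumptions.

```python
from collections import Counter, defaultdict


def opinion_feature_confidence(pairs):
    """conf(w -> F) = count(F, w) / count(w) over explicit (feature, opinion) pairs."""
    pair_counts = Counter(pairs)
    opinion_counts = Counter(opinion for _, opinion in pairs)
    conf = defaultdict(dict)
    for (feature, opinion), count in pair_counts.items():
        conf[opinion][feature] = count / opinion_counts[opinion]
    return conf


def infer_implicit_feature(opinion, conf, min_conf=0.6):
    """Most confident feature for an opinion word, or None if below min_conf."""
    candidates = conf.get(opinion, {})
    if not candidates:
        return None
    feature, score = max(candidates.items(), key=lambda item: item[1])
    return feature if score >= min_conf else None


# Toy explicit pairs: "heavy" mostly modifies "weight".
pairs = [("weight", "heavy")] * 3 + [("lens", "heavy")] + [("price", "expensive")] * 2
conf = opinion_feature_confidence(pairs)
implied = infer_implicit_feature("heavy", conf)
```

Under this reading, a sentence such as “too heavy” with no explicit feature would be assigned the feature "weight".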

3. Opinion Extraction (a part of Sentiment Analysis)
- Sentiment analysis: subjective/objective text classification, sentiment tracking, product opinion mining, etc.
- Opinion extraction: context-based opinion polarity identification

What is opinion?
- Opinion: words or phrases that express a semantic orientation (positive, negative, or neutral)
  - Context-independent opinions (“good”, “bad”, etc.)
  - Context-dependent opinions (“big”, “small”, etc.)
- Opinion semantic orientation identification
  - Context-independent opinions
  - Context-dependent opinions

Related Works
- Context-independent opinions
  - WordNet-based method [1, 2, 5]: seed list, incremental
  - PMI-SO method [Turney 24]: seed list (“excellent”, “awful”, etc.)
- Context-dependent opinions
  - Syntactic rules (conjunction, disjunction, etc.) [Ding 20]
  - Semantic clustering based [Liu 5], [Yang “Study of Structurizing Chinese Product Review”]
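
As an illustration of the seed-list idea behind PMI-SO, the sketch below scores a word by its PMI with a positive seed minus its PMI with a negative seed, which simplifies to a single log ratio of hit counts. Review-level co-occurrence on a local corpus, the additive smoothing constant, the base-2 logarithm, and the function name `so_pmi` are assumptions; the seeds “excellent” and “awful” are the ones named on the slide.

```python
import math


def so_pmi(reviews, word, pos_seed="excellent", neg_seed="awful", smooth=0.01):
    """SO(word) = PMI(word, pos_seed) - PMI(word, neg_seed), from hit counts.

    The difference of the two PMI values reduces to
    log2( hits(word, pos) * hits(neg) / (hits(word, neg) * hits(pos)) ).
    `smooth` is added to every count to avoid division by zero.
    """
    docs = [set(r.lower().split()) for r in reviews]

    def hits(*terms):
        return sum(all(t in d for t in terms) for d in docs)

    numerator = (hits(word, pos_seed) + smooth) * (hits(neg_seed) + smooth)
    denominator = (hits(word, neg_seed) + smooth) * (hits(pos_seed) + smooth)
    return math.log2(numerator / denominator)


toy_reviews = [
    "excellent sturdy tripod",
    "sturdy and excellent build",
    "awful flimsy strap",
    "flimsy awful case",
    "sturdy case",
]
sturdy_so = so_pmi(toy_reviews, "sturdy")
flimsy_so = so_pmi(toy_reviews, "flimsy")
```

A positive score suggests a positive orientation and a negative score a negative one. Because the seeds are fixed, this only handles context-independent opinion words.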

Problems
- Finding the context of an opinion word
  1. Word level
     E.g. “good”, “bad”, etc. (context-independent opinions)
  2. <feature, opinion> pair level
     E.g. “The camera is too heavy”: <camera, heavy> is negative
  3. Sentence level
     E.g. “The camera is very shining but I don’t like it.”
     Almost no existing research considers this problem
     Splitting on “but” gives <camera, shining> as positive (it is actually negative here)

Future works
- Try to tackle these problems, especially problem 3 (sentence level)

4. Product Scoring and Ranking
- Related work: product recommendation based on reviews [9], [12], etc.
- Problems:
  - Only one feature is considered at a time [12 Red Opal], but a need always has several features
  - All reviews are treated as equal [all], but different reviews express different needs
  - Only numerical scores (always total scores) are considered [3, 4, 12]; in a review, fa’s polarity may be negative and fb’s polarity positive, yet the reviewer gives a score of 3 stars

Product Scoring and Ranking
- Need-based product recommendation
  - Focus on multiple features at a time
  - Weight each review by how well it satisfies the given need
  - Topic-based opinion extraction
- Need = <n, F, W>
  - n: a word or phrase that reveals the consumer need
  - F: feature set
  - W: weight of each feature
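
Product Scoring and Ranking
- Product scoring:
  $$P.Score = \frac{1}{n} \sum_{i \in [1, n]} NA_i \cdot SO_i$$
  $$SO_i = \sum_{f_j \in F} w_j \cdot SO_{f_j}$$
  NA_i is the need association (NA) term; its defining formula (in terms of counts N, n, and F_i, with a factor of 2) is not recoverable from the transcript
- Product ranking: by product scores, NA (need association), etc.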
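
The scoring step can be sketched as below, under a particular reading of the slide: i indexes a product's reviews, SO_i is the weighted sum of per-feature orientations in review i, NA_i is a given per-review need-association weight, and the product score averages NA_i * SO_i over the reviews. The averaging, the input layout, and every name in the code are assumptions; NA values are taken as inputs because their formula is not legible in the transcript.

```python
def review_orientation(feature_so, weights):
    """SO_i = sum over need features f_j of w_j * SO_{f_j}.

    Features the review does not mention contribute 0 (an assumption).
    """
    return sum(w * feature_so.get(f, 0.0) for f, w in weights.items())


def product_score(reviews, weights):
    """Average of NA_i * SO_i over a product's reviews."""
    if not reviews:
        return 0.0
    total = sum(r["na"] * review_orientation(r["so"], weights) for r in reviews)
    return total / len(reviews)


def rank_products(products, weights):
    """Product names ordered from highest to lowest score."""
    return sorted(products, key=lambda p: product_score(products[p], weights), reverse=True)


# Weights W from the "camera for climbing" need; reviews and NA values are made up.
weights = {"size": 0.3, "weight": 0.5, "wide-angle": 0.2}
products = {
    "camera_a": [
        {"na": 1.0, "so": {"weight": 1, "size": 1}},
        {"na": 0.5, "so": {"weight": -1}},
    ],
    "camera_b": [
        {"na": 1.0, "so": {"weight": -1, "wide-angle": 1}},
    ],
}
ranking = rank_products(products, weights)
```

Because each review carries its own NA weight, a review that barely relates to the need contributes little to the score, which addresses the “all reviews are equal” problem listed earlier.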

Reference
[1] Liu.B. Opinion Observer: Analyzing and Comparing Opinions on the Web. WWW05
[2] Liu.B. Mining and Summarizing Customer Reviews. KDD04
[3] Turney.P. Thumbs Up or Thumbs Down?: Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL02
[4] Dave.K. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. WWW03
[5] Liu. CRO: a system for online review structurization. KDD08
[6] Kotler.P. Marketing Management. Prentice Hall 2003
[7] Orman.L. Consumer Support System. Communications of ACM 2007
[8] Lee.T. Need-based Analysis of Online Customer Reviews. ICEC07
[9] Lee.T. Needs-Centric Searching and Ranking Based on Customer Reviews. ICEC08
[10] Lee.T. Use-centric mining of customer reviews. WITS04
[11] Lee.T. Constraint-based Ontology Induction from Online Customer Reviews. Group Decision and Negotiation
[12] Scaffidi.C. Red Opal: Product-Feature Scoring from Reviews. ACM-EC 07
[13] Scaffidi.C. Application of a Probability-based Algorithm to Extracting of Product Features from Online Reviews. CMU Technical Report 06
[14] H. J Zeng. A unified framework for clustering heterogeneous web objects. ICWISE 02
[15] Q.Su. Hidden Sentiment Association in Chinese Web Opinion Mining. WWW 08
[16] X.Y Du. A Survey on Ontology Learning Research. Journal of Software 06
[17] W.Wei. Extracting Feature and Opinion Words Effectively from Chinese Product Review. FSKD 08
[18] J.D Wang. ReCoM: Reinforcement Clustering of Multi-Type Interrelated Data Objects. SIGIR 03
[19] G.J. SimRank: A measure of Structural-Context Similarity. SIGKDD 02
[20] X.W Ding. A Holistic Lexicon-Based Approach to Opinion Mining. WSDM 08
[21] P.Turney. Mining the Web for Synonyms: PMI-IR Versus LSA on TOEFL. ECML 01
[22] Manning.C. Foundations of Statistical Natural Language Processing. MIT Press 1999
[23] AltaVista: AltaVista Advanced Search Cheat Sheet. Alta Vista Company 01
[24] A.Maria. Extracting Product Features and Opinions from Reviews. EMNLP 05

End.
Any questions?