ReviewAnalysis MLconf 2016 JPrendki
Review Analysis: An Approach to Leveraging User-Generated Content in the Context of Retail
Jennifer Prendki, Principal Data Scientist
Walmart Global e-Commerce, California, USA
The Machine Learning Conference, San Francisco, CA
11/11/2016
Outline
• Business Motivation
• Algorithm Pipeline
• Feature Space Computation
• Sentiment Capture
• Real-Life Examples and Results
• Future Work and Conclusions
Business Motivation
"I bought this for my daughter to do her college work on, it's been great, no problems so far." [SuperMom72]
"Works like a charm, would definitely recommend to anyone on a budget." [Vamsy]
"Fast CPU but slow disk drive slows everything down." [TalonBay]
"I don't do gaming or downloading movies or music, so for those folks I can't speak to the performance. But for surfing the web, checking email, etc., this computer will save you time for watching the little ball spin!" [Anonymous]
Review Analysis: A Current Landscape
• Sentiment analysis
  • Best-known use case: social media analysis / tweets
  • Why tweets? → Shorter, condensed, highly sentimental content
  • Movie review analysis: Kaggle analysis of the 'Rotten Tomatoes' dataset
• Regarding product review analysis
  • Few to no papers on product review analysis at commercial scale
  • Shortage of work combining topic modeling and sentiment analysis
Our research: Combine feature computation and sentiment analysis to summarize reviewers' opinions about a specific product
Algorithm Pipeline
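[Diagram: the reviews of products $\alpha$, $\beta$ and $\gamma$ in category $C$ go through Feature Space Computation, giving the category feature space $F_C$; Feature Space Reduction then gives the per-product feature spaces $F_\alpha$, $F_\beta$ and $F_\gamma$.]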
[Diagram: Sentiment Computation For Each Review. For every review of products $\alpha$, $\beta$ and $\gamma$ in category $C$, a sentiment is computed for the relevant features in $F_\alpha$, $F_\beta$ and $F_\gamma$.]
[Diagram: Sentiment Computation For Each Review, with sentiment computed for relevant features. For products $\alpha$, $\beta$ and $\gamma$ in category $C$, the outputs are $\sigma_{t,\alpha,f}$ for all $t \in \tau$ and $f \in F_\alpha$, $\sigma_{t,\beta,f}$ for all $t \in \tau$ and $f \in F_\beta$, and $\sigma_{t,\gamma,f}$ for all $t \in \tau$ and $f \in F_\gamma$.]
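
The pipeline above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation behind the talk: `toy_score` and its word lists are hypothetical stand-ins for a real scorer such as VADER, a feature is matched by plain substring search, and the index t is taken to be the position of a review in the list.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical word lists: a toy stand-in for a real sentiment scorer such as VADER.
POSITIVE = {"great", "love", "amazing", "good"}
NEGATIVE = {"bad", "hate", "slow"}


def toy_score(sentence: str) -> float:
    """Count positive words minus negative words in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return float(sum((w in POSITIVE) - (w in NEGATIVE) for w in words))


def feature_sentiments(
    reviews: List[str],
    features: List[str],
    score: Callable[[str], float] = toy_score,
) -> Dict[Tuple[int, str], float]:
    """Return sigma[(t, f)]: mean score of the sentences of review t that mention feature f."""
    sigma: Dict[Tuple[int, str], float] = {}
    for t, text in enumerate(reviews):
        # Naive sentence split on ., ! or ? followed by whitespace (assumption).
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
        for f in features:
            hits = [score(s) for s in sentences if f.lower() in s.lower()]
            if hits:  # only relevant features get a sentiment
                sigma[(t, f)] = sum(hits) / len(hits)
    return sigma


reviews_alpha = ["The design is great. The battery is bad.", "I love the screen."]
features_alpha = ["design", "battery", "screen"]
sigma_alpha = feature_sentiments(reviews_alpha, features_alpha)
```

Passing the scorer as a parameter keeps the loop over reviews and features independent of the sentiment tool that is eventually plugged in.
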
Feature Space Computation
• Textual reviews go through a careful process:
  • TF / TF-IDF transform on documents
  • Stop-word removal, stemming, part-of-speech selection
  • Spell-checking
  • etc.
In short: Preprocessing is crucial to extracting relevant features
• 'Synonym' computation
  • Can be done using word embeddings (GloVe, word2vec)
  • Can be done by building a synonym graph from a dictionary or Wikipedia
  • Is complex and tricky, context-sensitive, unsupervised
In short: Creating synonym sets is difficult, and challenging as an online algorithm
Sentiment Capture with Vader
VADER: Valence Aware Dictionary and sEntiment Reasoner
• Is available as a Python sub-module of the nltk module
• Is a lexicon- and rule-based sentiment analysis tool
• Is specifically attuned to sentiments expressed in social media
• Is fully open-source, released under the MIT License
Sentiment is not boolean: sentiment as a PDF over pos, neg and neu
Example output: {'neg': 0.0, 'neu': 0.58, 'pos': 0.42, 'compound': 0.4404}
• neg, in [0, 1]: intensity of negativity in the sentence
• neu, in [0, 1]: intensity of neutrality in the sentence
• pos, in [0, 1]: intensity of positivity in the sentence
• compound, in [-1, 1]: combination of positive and negative sentiments; allows positive and negative to 'compensate' one another
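
To make the compound score less of a black box, here is a minimal sketch of the normalization VADER applies to the summed word valences, x / sqrt(x² + α) with α = 15. The lexicon valences for "good" and "bad" and the booster increment for "very" are quoted from memory of the VADER lexicon and should be treated as assumptions to check against the installed version; the function itself is a re-implementation for illustration, not the library's code.

```python
import math


def normalize(valence_sum: float, alpha: float = 15.0) -> float:
    """Map an unbounded sum of word valences into [-1, 1]."""
    score = valence_sum / math.sqrt(valence_sum * valence_sum + alpha)
    return max(-1.0, min(1.0, score))


# Assumed lexicon values (check against the installed VADER lexicon).
GOOD = 1.9          # valence of "good"
BAD = -2.5          # valence of "bad"
VERY_BOOST = 0.293  # increment added by a booster word such as "very"

compound_good = round(normalize(GOOD), 4)
compound_bad = round(normalize(BAD), 4)
compound_very_good = round(normalize(GOOD + VERY_BOOST), 4)
```

Under these assumptions the three values come out as 0.4404, -0.5423 and 0.4927, which agree with the compound scores the slides report for "a good deal", "a bad deal" and "a very good deal".
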
"This computer is a good deal.": pos 0.42, neu 0.58, neg 0.0, compound 0.4404
"This computer is a bad deal.": pos 0.0, neu 0.533, neg 0.467, compound -0.5423
"This computer is not powerful.": pos 0.0, neu 0.632, neg 0.368, compound -0.3252
"This computer is not that powerful.": pos 0.0, neu 0.682, neg 0.318, compound -0.3252
"This computer is not powerful, but I like it anyways.": pos 0.0, neu 0.618, neg 0.382, compound -0.5157
"This computer is not that powerful, but I like it anyways.": pos 0.252, neu 0.619, neg 0.129, compound 0.3786
→ VADER is sensitive to adverbs, punctuation, case, emoticons and nuances
Product A
"The design and picture quality are amazing!"
"I love it! Just perfect for people on a budget. And it is beautifully designed!!"
"Pretty good, but I am not a fan of the design."
"I don't think it's possible to find better for the price"
Product B
"I just HATE the design!!"
"Okay computer. Wish I read the other reviews first."
[Figure: each review is linked to the features it mentions (design, picture quality), annotated with sentiment scores + 0.39, + 0.39, + 0.56, ~ 0.48 and - 0.62.]
[Figure: Scraping, Summarizing, Sentiment Intensity. The per-review feature sentiments of Product A and Product B are summarized into one table per product, listing features (design, picture quality, battery, value, CPU, processor) with a sentiment intensity and a review count, and NA where a feature is not mentioned.]
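
A minimal sketch of the summarizing step, assuming it reduces the per-review scores to a mean intensity and a review count per product and feature. The slides do not state the aggregation actually used, so the mean is an assumption, and the triples below are illustrative values loosely modeled on the slide examples rather than real results.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Illustrative (product, feature, signed sentiment) triples; not real results.
observations = [
    ("Product A", "design", 0.39),
    ("Product A", "picture quality", 0.39),
    ("Product A", "design", 0.56),
    ("Product B", "design", -0.62),
]


def summarize(
    observations: List[Tuple[str, str, float]],
) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Group scores by (product, feature) and report mean intensity and count."""
    grouped = defaultdict(list)
    for product, feature, score in observations:
        grouped[(product, feature)].append(score)
    return {
        key: {"mean": mean(scores), "count": len(scores)}
        for key, scores in grouped.items()
    }


summary = summarize(observations)
```

A (product, feature) pair with no mentions is simply absent from the result, which corresponds to the NA cells of the summary table.
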
Results Discussion: Real Life Example
[Figure: NEG / NEU / POS sentiment per feature (screen, design, quality, performance) for Product A and Product B.]
• Some dissatisfaction with overall quality
• Reviewers are rather happy with keyboard
• Weight is better for product A than for product B
• Customers satisfied with keyboard, display, screen, design, …
• Product B's weakness is battery life
• The product's features are well documented
Product A: 11 reviews, ~ 53 words per review
"This laptop exceeds my expectations. It's fast, it's powerful, it's compact and great to travel with."
"The screen is amazing and the keyboard too. the weight is so light, it's become my portable."
"It's durable, good keyboard, decent screen, and a good battery life."
"Plasticky build quality but holds up with my rough and tough handling. Is is surprisingly light. Keyboard is the best but it tales a bit of getting used to […]"
"Very light to carry and the carbon color gives an elegant finishing touch."
Product B: 40 reviews, ~ 83 words per review
"The design is what caught my eye. Everything about this laptop is okay, except the battery life."
"Overall a great laptop with good display and build quality, solid performance and sleek design the only major concern is battery life."
"The touch screen is absolutely first rate […], and the back-lit keyboard has just the right feel."
"This is the best computer I've ever owned. […]. I love the backlit keyboard, the easily adjustable resolution and the long battery life."
"Pros: great screen, keyboard feels nice, best touchpad, very fast, extremely light, built durable Cons: battery life is less than competitors […]."
"It's really light weight yet really durable. I love the keyboard and mouse pad."
Conclusion and Future Work
Work in Progress
• Synonym computation: work in progress
• Observed bias in sentiment, needs particular attention
• Alternative when few or no reviews exist?
Potential future applications
• Offer a snapshot of product reviews to customers
• Assist customers in finding similar items with enhanced feature(s)
• Process seller satisfaction information/rating
• Customer email processing: determine the subject of a request automatically
References
[1] Gensim: https://radimrehurek.com/gensim/models/word2vec.html
[2] GloVe: http://nlp.stanford.edu/projects/glove/
[3] WordNet: https://wordnet.princeton.edu/
[4] nltk.stem: http://www.nltk.org/api/nltk.stem.html
[5] nltk.vader. Paper: C.J. Hutto, Eric Gilbert, "VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text". Code: http://www.nltk.org/_modules/nltk/sentiment/vader.html
Questions?
Back-Up Slides
Sentiment Capture with Vader
>>> sentence1 = "This computer is a good deal."
{'neg': 0.0, 'neu': 0.58, 'pos': 0.42, 'compound': 0.4404}
>>> sentence2 = "This computer is a very good deal."       # adverb addition
{'neg': 0.0, 'neu': 0.61, 'pos': 0.39, 'compound': 0.4927}
>>> sentence3 = "This computer is a very good deal!!"      # punctuation addition
{'neg': 0.0, 'neu': 0.57, 'pos': 0.43, 'compound': 0.5827}
>>> sentence4 = "This computer is a very good deal!! :-)"  # emoticon addition
{'neg': 0.0, 'neu': 0.441, 'pos': 0.559, 'compound': 0.7462}
>>> sentence5 = "This computer is a VERY good deal!! :-)"  # case enhancement
{'neg': 0.0, 'neu': 0.393, 'pos': 0.607, 'compound': 0.8287}
>>> sentence6 = "This computer is a very bad deal!! :-("   # inverse polarity
{'neg': 0.588, 'neu': 0.412, 'pos': 0.0, 'compound': -0.7987}
Ratings vs Sentiment Analysis
[Figure: average sentiment from text review (negative, neutral, positive) versus ratings (number of stars).]
[Figure labels: Good reviews, Bad reviews.]
$$\text{user bias} = \frac{n(\text{pos}) - n(\text{neg})}{n(\text{pos}) + n(\text{neg})}$$
where $n(\text{pos})$ = number of prior good reviews and $n(\text{neg})$ = number of prior bad reviews
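
The user bias formula translates directly into code. The sketch below follows the definition on the slide; returning 0.0 for a user with no prior reviews is an assumption, since the slide does not cover that case.

```python
def user_bias(n_pos: int, n_neg: int) -> float:
    """(n(pos) - n(neg)) / (n(pos) + n(neg)), a value in [-1, 1]."""
    total = n_pos + n_neg
    if total == 0:
        return 0.0  # assumption: a user with no history is treated as unbiased
    return (n_pos - n_neg) / total


# A user with 3 prior good reviews and 1 prior bad review.
example_bias = user_bias(3, 1)
```

A value near +1 describes a reviewer who almost always writes good reviews, a value near -1 one who almost always writes bad ones.
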
Review Bias
• Where is subjectivity coming from?
  • Language bias / gender bias / etc.
  • Vader package biases due to development specificity? (remember: originally developed for social media)
  • Incentivized customers/reviewers
• Why is it important to correct for it?
  • Filtering/sorting with ratings doesn't work as well as expected
• Possible options
  • Filter reviews with large bias
  • Weight results
  • Re-center the output of Vader to fit our definition of 'neutrality'
In short: Biases exist in both ratings and textual sentiment; both need attention
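
One possible reading of the re-centering option is sketched below: shift the compound scores so that a chosen baseline becomes the new neutral point. The choice of baseline is an assumption; the slides do not say how 'neutrality' is defined, so the sketch defaults to the mean compound score of the corpus and lets the caller supply another value.

```python
from typing import List, Optional


def recenter(compounds: List[float], baseline: Optional[float] = None) -> List[float]:
    """Shift compound scores so that `baseline` maps to 0, clipped to [-1, 1]."""
    if not compounds:
        return []
    if baseline is None:
        # Assumption: use the corpus mean as the working definition of neutral.
        baseline = sum(compounds) / len(compounds)
    return [max(-1.0, min(1.0, c - baseline)) for c in compounds]


# Illustrative compound scores that skew positive.
shifted = recenter([0.2, 0.4, 0.6])
```

With a positively skewed corpus, scores that looked mildly positive before the shift become negative relative to the typical review, which is the effect the re-centering is meant to achieve.
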