Opinion Mining Using Econometrics: A Case Study on
Reputation Systems
Anindya Ghose, Panagiotis G. Ipeirotis, and Arun Sundararajan
Leonard N. Stern School of Business, New York University
ACL 2007
Questions/Challenges
• What makes an opinion positive or negative? Is there an objective measure for this task?
• How can we rank opinions according to their strength? Can we define an objective measure for ranking opinions?
• How does the context change the polarity and strength of an opinion and how can we take the context into consideration?
Introduction
• Reputation systems in electronic markets
• Pricing power of merchants in Amazon.com
• Using 9,500 transactions over 180 days
• Textual feedback and star ratings
• Discover polarity and strength without the need for human annotations or linguistic resources.
Arguments
• Textual feedback affects the power of merchants to charge higher prices than the competition for the same product and still make a sale.
Reputation Systems
• A reputation profile
  – Past transactions for the merchant.
  – Numerical ratings from buyers who have completed transactions.
  – Chronological list of textual feedback provided by these buyers.
Data
• Transaction Data:
  – 1,078 merchants
  – 9,487 unique transactions
  – 107,922 price premiums
• Reputation Data:
  – 4,932 postings per merchant
  – Numerical ratings: one to five stars
  – Reconstruct each seller’s exact feedback profile at the time of each transaction
Econometrics-based Opinion Mining
• Retrieving the dimensions of reputation
  – Features expressed by noun, noun phrase, verb, verb phrase.
  – For example, X1 might be shipping, X2 might be packaging.
Reputation dimension example
• X=(delivery, packaging, service)
• I was impressed by the speedy delivery! Great service! (post 1)
• The item arrived in awful packaging and the delivery was slow. (post 2)
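As a rough illustration of how the two posts above map onto modifier-dimension pairs, here is a minimal sketch. The lexicons, the linking-verb list, and the two matching patterns are assumptions made for this toy example; the slides do not describe the actual extraction procedure at this level of detail.

```python
import re

# Hypothetical toy lexicons (the slides report 151 dimensions and
# 142 modifiers but do not list them).
DIMENSIONS = {"delivery", "packaging", "service"}
MODIFIERS = {"speedy", "great", "awful", "slow"}
LINKING = {"was", "is", "were", "are"}  # allowed between dimension and modifier


def extract_pairs(post):
    """Return (modifier, dimension) pairs found in one feedback posting."""
    pairs = []
    # Split into sentences so that pairs never cross a sentence boundary.
    for sentence in re.split(r"[.!?]+", post.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for i, tok in enumerate(tokens):
            if tok not in DIMENSIONS:
                continue
            # Pattern 1: modifier directly before the dimension ("speedy delivery").
            if i > 0 and tokens[i - 1] in MODIFIERS:
                pairs.append((tokens[i - 1], tok))
            # Pattern 2: dimension, linking verb, modifier ("delivery was slow").
            elif (i + 2 < len(tokens)
                  and tokens[i + 1] in LINKING
                  and tokens[i + 2] in MODIFIERS):
                pairs.append((tokens[i + 2], tok))
    return pairs


post1 = "I was impressed by the speedy delivery! Great service!"
post2 = "The item arrived in awful packaging and the delivery was slow."
print(extract_pairs(post1))
print(extract_pairs(post2))
```

A real system would rely on part-of-speech tagging or parsing rather than fixed word lists; the sketch only shows the shape of the output.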
Scoring the dimension of reputation
• Construct an n x p matrix M(si)
• A total of 151 unique dimensions, and a total of 142 modifiers.
• c is the probability of clicking on the “Next” link.
• K is the number of postings that appear on each page.
• Posting-specific weight
Posting-specific weight example
• The weight drops exponentially as the page number increases.
• If K = 25 and there are 51 posts in total:
  – weight of post 1 (page 1) = 1/(25 + 25c + c^2)
  – weight of post 26 (page 2) = c/(25 + 25c + c^2)
  – weight of post 51 (page 3) = c^2/(25 + 25c + c^2)
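The arithmetic in this example can be checked with a short sketch. It assumes postings are ranked from 1 upward, that the postings shown first carry weight 1, and that each further page multiplies the weight by c; the function name and the value c = 0.5 are illustrative choices, not taken from the slides.

```python
def posting_weights(n_posts, per_page, c):
    """Normalized weight for each posting rank 1..n_posts.

    Assumption: rank r sits (r - 1) // per_page pages after the first
    page, and each extra page multiplies the weight by c.
    """
    if n_posts < 1 or per_page < 1:
        raise ValueError("n_posts and per_page must be positive")
    if not 0.0 <= c <= 1.0:
        raise ValueError("c is a probability and must lie in [0, 1]")
    raw = [c ** ((rank - 1) // per_page) for rank in range(1, n_posts + 1)]
    total = sum(raw)  # equals 25 + 25c + c^2 when K = 25 and n = 51
    return [w / total for w in raw]


c = 0.5
weights = posting_weights(n_posts=51, per_page=25, c=c)
denominator = 25 + 25 * c + c ** 2
print(weights[0], 1 / denominator)        # post 1
print(weights[25], c / denominator)       # post 26
print(weights[50], c ** 2 / denominator)  # post 51
```

Because the weights are normalized, they sum to 1 for any merchant regardless of how many postings the profile contains.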
• Score of modifier-dimension (feature) pair:
Reputation Score
• Modifier-dimension pair score = strength
• Feature weight = polarity
• A correlation between the appearance of the modifier-dimension opinion phrases of the merchants and the price premiums observed for each transaction.
Scoring by regression
• Regressor coefficients
• Control variables
• Fixed effects
• The error term
• Score: counting appearances and weighting each appearance using the definition of ri
• Variations:
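The "Score" bullet on this slide (count appearances, weight each one by the posting-specific weight) can be sketched as follows. This is a minimal illustration: the function name, the input layout, and the example weights are assumptions, and the variations mentioned on the slide are not covered.

```python
from collections import defaultdict


def pair_scores(postings, weights):
    """Weighted appearance count of each (modifier, dimension) pair.

    postings: one list of (modifier, dimension) pairs per posting,
              in the same order as `weights`.
    weights:  posting-specific weights r_i (assumed already normalized).
    """
    if len(postings) != len(weights):
        raise ValueError("need exactly one weight per posting")
    scores = defaultdict(float)
    for pairs, weight in zip(postings, weights):
        for pair in pairs:
            # Every appearance contributes the weight of its posting.
            scores[pair] += weight
    return dict(scores)


# Hypothetical three-posting profile with made-up weights.
profile = [
    [("speedy", "delivery"), ("great", "service")],
    [("awful", "packaging"), ("slow", "delivery")],
    [("speedy", "delivery")],
]
scores = pair_scores(profile, [0.5, 0.3, 0.2])
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

Each resulting score would then serve as one regressor per modifier-dimension pair in the price-premium regression.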
Regression Settings
• Predicting
• Control variables:
  – The product’s price on Amazon
  – The average star rating of the merchant
  – The number of the merchant’s past transactions
  – The number of sellers for the product
Experimental Results
• Human Recall by two annotators, a random sample of 1,000 posts:
• Computer Recall: average over two annotators
• Precision is not an issue; noise will be ruled out by the regression
Price Premiums vs. Ratings
• Many studies assume that text feedback does not influence buyers, and use only the star ratings as a summary of opinions.
• Examine the R^2 fit of the regression with and without the text variables. Without: R^2 = 0.35; with: R^2 = 0.63.
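The with/without comparison can be illustrated on synthetic data. The sketch below fits ordinary least squares twice, once on a star-rating regressor alone and once with an added text-score regressor, and reports R^2 for each. All data, coefficients, and variable names are made up for illustration and have no connection to the paper's data set or its reported values; the real regression also includes the control variables and fixed effects listed earlier.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(rows, y):
    """Fit OLS with an intercept via the normal equations; return R^2."""
    x = [[1.0] + list(row) for row in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic toy data (NOT the paper's data): the premium depends on both
# the star rating and a text-derived score, plus Gaussian noise.
rng = random.Random(0)
n = 200
stars = [rng.uniform(3.0, 5.0) for _ in range(n)]
text_score = [rng.uniform(0.0, 1.0) for _ in range(n)]
premium = [2.0 * s + 5.0 * t + rng.gauss(0.0, 1.0)
           for s, t in zip(stars, text_score)]

r2_without = r_squared([[s] for s in stars], premium)
r2_with = r_squared([[s, t] for s, t in zip(stars, text_score)], premium)
print(f"R^2 without text: {r2_without:.2f}, with text: {r2_with:.2f}")
```

Adding regressors to an OLS model never lowers R^2, so the size of the increase, not its sign, is what carries information.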
• Text contains significantly more information than the ratings.
Prediction
• Predict which merchant will make a sale.
• C4.5 classifier, with 4 months of data for training and 2 months for testing.
Possible application
• Examine the effect of product reviews on product sales and detect the weight that customers put on different product features.
• Analyze the effect of news stories on stock prices, including how opinion holders and their wording can move the market up or down.
• Extract the pragmatic effect of news and blogs on elections or other political events.