AMANDA COHEN MOSTAFAVI Applying Entity Discovery and Assignment to video games in order to mine opinions

AMANDA COHEN MOSTAFAVI

Applying Entity Discovery and Assignment to video games in order to mine opinions

Project Purpose

• Many differing opinions exist on a video game’s quality, so it is difficult to determine the general opinion
• People usually look to professional video game reviews
• Can gather review scores, then normalize and average them to determine a general consensus
  – Done on GameRankings.com
• However, this ignores the discussion by everyday players
  – Debate takes place most commonly on message boards

Project Purpose

Solution: Mine opinions expressed on message board posts and derive a consensus from the results

Using the algorithm for entity discovery and assignment and opinion mining as defined in this paper:

Xiaowen Ding, Bing Liu, Lei Zhang. Entity Discovery and Assignment for Opinion Mining Applications. SIGKDD, 2009

Goal:

To mine opinions on selected games expressed on video game message boards, derive an average opinion and compare results to the review scores gathered by GameRankings.com

Games

• Total games examined: 10
• All released in 2007
• 5 were top-selling games of the year, according to the NPD Group (market research group that studies the video game industry, among other things)
• 5 are among the highest reviewed games according to GameRankings.com
• Ensures a mix of critically and commercially successful games
• Note: Duplicate games are removed

Games

• High selling
  – Halo 3 (360, Microsoft) - 4.82 million
  – Wii Play with Wii Remote (Wii, Nintendo) - 4.12 million
  – Call of Duty 4: Modern Warfare (360, Activision) - 3.04 million
  – Guitar Hero III: Legends of Rock (PS2, Activision) - 2.72 million
  – Super Mario Galaxy (Wii, Nintendo) - 2.52 million

• Highly reviewed
  – The Orange Box (PC, Xbox 360) - 96%
  – BioShock (PC, Xbox 360) - 94%
  – Elder Scrolls IV: Oblivion (PS3) - 92%
  – God of War II (PS2) - 92%
  – Team Fortress 2 (PC) - 92%

Game Issues

Alternate Names: Games are often referenced by shorthand or abbreviation

Solution: include an array of possible alternate names in defining the entity object
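
One way to picture the entity object with its alternate names is the minimal Python sketch below. The class name `GameEntity`, its fields, and the specific abbreviations listed are assumptions made for illustration; the slides do not show the project's actual data structure or name lists.

```python
from dataclasses import dataclass, field


@dataclass
class GameEntity:
    """A game title plus the shorthand names posters might use for it."""
    title: str
    alternate_names: list[str] = field(default_factory=list)

    def all_names(self) -> list[str]:
        # Longest names first, so a full title is tried before an abbreviation.
        return sorted([self.title, *self.alternate_names], key=len, reverse=True)


# Illustrative alternate names only (assumed, not taken from the project).
GAMES = [
    GameEntity("Call of Duty 4: Modern Warfare", ["Call of Duty 4", "COD4"]),
    GameEntity("Team Fortress 2", ["TF2"]),
]
```

Keeping the names on the entity itself means later stages only need to iterate over `GAMES` to recognize any form of a title.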

Message Boards

Principally from video game websites, or websites with large portions devoted to video games

Looking at comments in relation to articles about top selling games or reviews to ensure that the posts are relevant to the games

Lots of comparative statements as well

Message Board Posts

1UP.com: 26 posts
Gamespot.com: 14 posts
IGN.com: 20 posts
Total: 60 posts

Post Issues

Unusual ways of expressing opinions: message board posters may not express their opinions in the same way as someone writing a review would. For instance:

• “Call of Duty 4 was a very good game” <- this sentence would make for very easy opinion mining
• “COD4 IS TEH WIN, OMG!!!!111” <- more likely on a message board, and much harder to mine

Solution: The opinion mining algorithm allows for “opinion grammar”. More later…

The Process

Implements the Entity Discovery and Assignment algorithm, with a couple of modifications:

• Entity discovery section reduced to better fit the purposes of the project
• Ordinarily would use pattern mining in order to find entities; not an issue in this case since there is a predetermined set of games examined

Data preprocessing

Each word in every post is given a part-of-speech tag
• Designates the grammatical role of each word
• Done with Stanford’s POS tagger, developed by the Stanford Natural Language Processing group
  – http://nlp.stanford.edu/software/tagger.shtml

A list of the entities used is created, and their alternate names are defined

Entity Discovery and Assignment

Each post is parsed to separate sentences and find each entity

If entity is found, and matches the game title, the entity is assigned to that sentence

If there is no entity, the entity of the previous sentence is assigned
• Works on the assumption that when someone starts talking about an entity, subsequent sentences deal with the same entity without explicitly stating it

If an alternate name for the entity is found, it is replaced with the original title to reduce future processing time
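
The assignment rules above can be sketched roughly as follows. This is a simplified illustration, not the project's code: the `ALIASES` table, the regex-based sentence splitter, and the function names are all assumptions, and the sketch matches names on raw text rather than on POS-tagged tokens.

```python
import re

# Hypothetical alias table: canonical title -> alternate names (illustrative only).
ALIASES = {
    "Call of Duty 4": ["COD4"],
    "Halo 3": ["Halo3"],
}


def split_sentences(post: str) -> list[str]:
    # Naive splitter on ., ! or ? followed by whitespace (an assumption for brevity).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", post) if s.strip()]


def canonicalize(sentence: str) -> str:
    # Replace each alternate name with the original title.
    for title, alternates in ALIASES.items():
        for alt in alternates:
            sentence = re.sub(rf"\b{re.escape(alt)}\b", title, sentence,
                              flags=re.IGNORECASE)
    return sentence


def assign_entities(post: str) -> list[tuple[str, list[str]]]:
    assigned = []
    previous: list[str] = []
    for sentence in split_sentences(post):
        sentence = canonicalize(sentence)
        found = [t for t in ALIASES if t.lower() in sentence.lower()]
        # No entity in this sentence: inherit the previous sentence's entity.
        entities = found or previous
        assigned.append((sentence, entities))
        previous = entities
    return assigned
```

For example, `assign_entities("COD4 is great. The multiplayer is fun.")` assigns "Call of Duty 4" to both sentences: the first by alias replacement, the second by carrying the entity forward.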

Opinion Grammar

The original authors suggest that hard-coding every possible opinion word is not recommended

Instead, they suggest using a system to define grammar that will pick out opinion words and statements

A combination of a hard-coded word list and grammar rules was used for this project
• Hard-coded words for regular English grammar
• Defined rules for more unexpected words and phrases

Indicator Word Symbols

Po: Positive
Ne: Negative
Neu: Neutral
Ng: Negation
But: But-like

Opinion Mining

Step 1: Apply indicator word symbols
Step 2: Apply phrase rules
Step 3: Search for negations, and change the opinion of the subsequent word (if it was positive, it becomes negative and vice versa)
Step 4: Aggregate opinions
• Search for indicators: Po = 1, Ne = -1, Neu = 0
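
A minimal sketch of these four steps is given below, under several assumptions: the word list and phrase rules are tiny made-up examples, phrase rules are applied before single-word lookup so that a multi-word phrase collapses into one symbol, a negation flips the next opinion symbol rather than strictly the next word, and But-like words are tagged but not otherwise acted on (the slides do not describe how they are handled).

```python
import re

# Tiny illustrative lexicon; the project's actual word lists are not given.
INDICATORS = {
    "good": "Po", "great": "Po", "fun": "Po",
    "bad": "Ne", "boring": "Ne", "broken": "Ne",
    "not": "Ng", "never": "Ng",
    "but": "But", "however": "But",
}
# Hypothetical phrase rules for message-board slang.
PHRASE_RULES = [
    (re.compile(r"\bteh win\b"), "Po"),
    (re.compile(r"\bepic fail\b"), "Ne"),
]
SCORES = {"Po": 1, "Ne": -1, "Neu": 0}


def tag_sentence(sentence: str) -> list[str]:
    text = sentence.lower()
    # Step 2 (phrase rules), applied first so a phrase becomes a single symbol.
    for pattern, symbol in PHRASE_RULES:
        text = pattern.sub(f" <{symbol}> ", text)
    symbols = []
    for token in re.findall(r"<\w+>|[a-z']+", text):
        if token.startswith("<"):
            symbols.append(token[1:-1])
        elif token in INDICATORS:
            symbols.append(INDICATORS[token])  # Step 1: indicator word symbols
    return symbols


def apply_negation(symbols: list[str]) -> list[str]:
    # Step 3: a negation flips the polarity of the next opinion symbol.
    result, negate = [], False
    for symbol in symbols:
        if symbol == "Ng":
            negate = True
            continue
        if negate and symbol in ("Po", "Ne"):
            symbol = "Ne" if symbol == "Po" else "Po"
        negate = False
        result.append(symbol)
    return result


def sentence_score(sentence: str) -> int:
    # Step 4: aggregate with Po = 1, Ne = -1, Neu = 0 (other symbols count as 0).
    return sum(SCORES.get(s, 0) for s in apply_negation(tag_sentence(sentence)))
```

With these toy rules, both “Call of Duty 4 was a very good game” and “COD4 IS TEH WIN, OMG!!!!111” score 1, while “This game is not good” scores -1.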

Comparative Sentences

If a sentence has more than one entity, it is a comparative sentence
• This sentence compares one entity to another, e.g. “Game-A is better than Game-B”

In order to find the superior and inferior entities, look for comparative or superlative words (according to POS tags) and whether each is a positive or negative word
• If negative, the entity after the comparative word is superior
• If positive, the entity before the comparative word is superior
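
The comparative rule could be sketched as below. The sketch assumes the sentence arrives as (word, POS tag) pairs using Penn Treebank tags (JJR/RBR for comparatives, JJS/RBS for superlatives), that each entity name has already been collapsed into a single token, and that the nearest entity on each side of the comparative word is the one being compared. The `POLARITY` table and function name are illustrative, not the project's.

```python
COMPARATIVE_TAGS = {"JJR", "RBR", "JJS", "RBS"}
# Illustrative polarity list for comparative words (an assumption).
POLARITY = {"better": "Po", "best": "Po", "worse": "Ne", "worst": "Ne"}


def rank_entities(tagged, entities):
    """Return (superior, inferior), or None if no usable comparison is found."""
    positions = [(i, word) for i, (word, _) in enumerate(tagged) if word in entities]
    if len({word for _, word in positions}) < 2:
        return None  # fewer than two entities: not a comparative sentence
    for i, (word, tag) in enumerate(tagged):
        polarity = POLARITY.get(word.lower())
        if tag not in COMPARATIVE_TAGS or polarity is None:
            continue
        before = [w for p, w in positions if p < i]
        after = [w for p, w in positions if p > i]
        if not before or not after:
            continue
        # Positive word: the entity before it is superior. Negative: the one after.
        if polarity == "Po":
            return before[-1], after[0]
        return after[0], before[-1]
    return None


tagged = [("Game-A", "NNP"), ("is", "VBZ"), ("better", "JJR"),
          ("than", "IN"), ("Game-B", "NNP")]
superior, inferior = rank_entities(tagged, {"Game-A", "Game-B"})
```

Here `superior` is "Game-A" and `inferior` is "Game-B"; swapping "better" for "worse" reverses the pair.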

Up next: demo and results…