Detecting and analysing emotion in social network sites Mike Thelwall Statistical Cybermetrics...
detecting and analysing emotion in social network sites
Mike Thelwall
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
Virtual Knowledge Studio (VKS)
Information Studies
MySpace comments case study
research motivation
sentiment is a frequently overlooked key factor in communication and relationships
it needs to be investigated to:
  understand the role of sentiment in new online environments
  identify people “at risk” of suicide
  discover emotional factors necessary for sustained online environments
  modify bots to detect and react appropriately to emotional communication
talk structure
part 1: background information about MySpace comment communication
part 2: automatically detecting sentiment in MySpace comments
MySpace comments are public or semi-public short messages exchanged by Friends
but what is their purpose and what do they look like?
comments
Displayed in public on home page – public personal messages
purpose 1: gossip (53% of dialogs) – examples of gossip comments
I moved to Houston, Tx.
I come home at the beginning of July
well i just diyed my hair nearly black!!
i regret not going to UMSX bc MZU is so much harder
i sooo messed up :((
for a white guy tim knows a lot of rap song
Tina talks about you all the time.
Nigel said you were feeling bad
purpose 2: coordination of offline activities (18% of dialogs)
CALL ME WHEN YOU GET A CHANCE
hey text me sometime.. [number]
i hope to see you toniiite <3
I'm gonna be in ABD in Jan. for like a week, we gotta hang out
Hey I can call you 2day?!!
purpose 3: keeping in contact
emotion in MySpace
how important is emotion expression in social network communication?
who uses emotion and what type of emotion?
emotion in Friend comments
most comments contain positive emotion (including formal expressions, such as “Love, Sue” or “raj x”)
few contain negative emotion
Emotion strength    +ve    -ve
1 (none)            34%    80%
2                   28%     6%
3                   35%    11%
4                    3%     2%
5 (strong)           0%     1%
Emotion strength in 819 random comments
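The claim that most comments carry positive emotion and few carry negative emotion can be checked directly from the percentages in the table. The sketch below does that arithmetic; the function names are illustrative, and the means are only approximate because the slide reports rounded percentages.

```python
# Percentages from the table (strength 1 = none, 5 = strong).
positive = {1: 34, 2: 28, 3: 35, 4: 3, 5: 0}
negative = {1: 80, 2: 6, 3: 11, 4: 2, 5: 1}

def share_with_emotion(dist):
    """Percentage of comments rated above strength 1 (none)."""
    return sum(pct for strength, pct in dist.items() if strength > 1)

def mean_strength(dist):
    """Average strength implied by a percentage distribution."""
    total = sum(dist.values())
    return sum(strength * pct for strength, pct in dist.items()) / total

print(share_with_emotion(positive), share_with_emotion(negative))
print(round(mean_strength(positive), 2), round(mean_strength(negative), 2))
```

Under these figures roughly two thirds of comments show some positive emotion and about one fifth show some negative emotion.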
emotion in Friend comments
positive emotion mainly used by females and mainly directed at females
no gender difference in negative emotions
             From female          From male
To female    2.4 (+), 1.3 (-)     2.0 (+), 1.3 (-)
To male      2.2 (+), 1.3 (-)     1.7 (+), 1.5 (-)
Average emotion strength in 819 random comments
CYBEREMOTIONS = data gathering + complex systems methods + ICT outputs
To identify and analyse Collective Emotions in Cyberspace
SentiStrength
problem 1: non-standard English in MySpace comments
Aspect of non-standard English                                            Comments
Typographic slang or abbreviations (e.g., omg, lol, hugz, @)              41%
Slang, including dialect, swearing, and idiomatic slang sayings           51%
Non-standard spelling other than the above                                33%
Non-standard punctuation                                                  81%
Pictograms                                                                16%
Interjections (e.g., haha, muahh, huh, but not oh)                        13%
Non-standard capitalisation                                               75%
Other non-standard English grammar                                        56%
Not standard formal written English (i.e., any of the above)              97%
common words in comments
Rank Word
1-10 i, you, to, the, and, a, u, me, hey, my
11-20 it, for, in, love, is, that, so, up, your, on
21-30 have, of, are, just, lol, but, we, how, be, ya
31-40 at, was, well, what, get, like, good, im, know, out
41-50 been, this, with, see, hope, all, do, not, if, happy
51-60 miss, going, go, time, i'm, ur, back, some, got, there
61-70 when, can, will, thanks, its, or, by, from, now, whats
71-80 say, day, new, hi, much, one, no, about, haha, call
81-90 come, :), soon, too, need, birthday, 2, am, had, here
91-100 dont, doing, as, think, man, page, great, did, weekend, work
Bold words are not in the top 100 for general British English, and italic words are not in the top 100 for general American English.
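A ranking like this can be produced by counting tokens across all comments. The sketch below is one way to do it, not the procedure used for the slide: the token pattern (lowercase words, digits, optional apostrophes and a few simple emoticons) is an assumption chosen so that items such as "i'm", "2" and ":)" survive as separate tokens, as they do in the list.

```python
import re
from collections import Counter

# Assumed token pattern: a simple emoticon, or a word/number with an optional apostrophe part.
# Text is lowercased first, so emoticons such as :D are matched as :d.
TOKEN = re.compile(r"[:;]-?[)(dp]|[a-z0-9]+(?:'[a-z]+)?")

def rank_words(comments, top_n=10):
    """Return the top_n most frequent tokens as (token, count) pairs."""
    counts = Counter()
    for comment in comments:
        counts.update(TOKEN.findall(comment.lower()))
    return counts.most_common(top_n)

# Made-up comments, for illustration only.
print(rank_words(["hey u :)", "hey i love u", "i'm back, miss u"], top_n=3))
```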
problem 2: swearing
rife in MySpace
conveys positive and negative emotions
ignored by existing sentiment analysis methods
emphatic adverb/adjective OR adverbial booster OR premodifying intensifying negative adjective (36% of swearing)
and we r guna go to town again n make a ryt fuckin nyt of it again lol
see look i'm fucking commenting u back lol and stop fucking tickleing me!!
Thanks for the party last night it was fucking good and you are great hosts.
That 50's rock and roll weekender was fucking mint!
yeah so me and sarah broke up and everythings fucking shit
personal insult referring to defined entity (28% of swearing)
tehe i am sorry.. i m such a sleep deprived twat alot of the time! lol
Maxy is the soundest cunt in the world!!!!
3rd? i thought i was your main man number one? Fucker
write bak cunt xxx
You evil cunt! Haha
lucky fuck
idiomatic set phrase OR figurative extension of literal meaning (23%, mostly male)
think am gonna get him an album or summet fuck nows
got another copy of the reaction CD (will had fucked the last one lol)
qu'est ce que fuck?
what the fuck pubehead whos pete and why is this necicery
mate Heh long story.. cant be fucked to explain :D
SentiStrength objective
1. detect positive and negative emotion in MySpace comments
2. develop workarounds for lack of grammar and spelling
3. harness emotion expression forms unique to MySpace or CMC (e.g., :-) or haaappppyyy!!!)
4. classify each MySpace comment as positive 1-5 AND negative 1-5
5. apply to social issues
SentiStrength algorithm
spelling correction for repeated letters
  Helllllo -> Hello (emphasis: llll)
list of +ve and -ve words with strengths (partly from LIWC; includes swearing)
  hate = -4, love = 3
extra heuristics
  emphasis acts to enhance + or - emotion
  emotion words ignored in questions
  take strongest +ve & -ve expression in whole comment
  booster words (e.g., very, some)
http://sentistrength.wlv.ac.uk/
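The steps listed for the algorithm can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SentiStrength implementation: only hate and love come from the slide, the booster values (+1 for "very", -1 for "some") are assumed, the question rule is simplified to whole comments ending in "?", and emphasis scoring is left out.

```python
import re

# hate and love come from the slide; the booster values are assumptions.
LEXICON = {"hate": -4, "love": 3}
BOOSTERS = {"very": 1, "some": -1}
VOCABULARY = set(LEXICON) | set(BOOSTERS) | {"hello"}

def normalise(word):
    """Shorten runs of 3+ repeated letters, preferring a form found in VOCABULARY."""
    doubled = re.sub(r"(.)\1{2,}", r"\1\1", word)
    if doubled in VOCABULARY:
        return doubled
    single = re.sub(r"(.)\1{2,}", r"\1", word)
    return single if single in VOCABULARY else doubled

def score_comment(text):
    """Return (positive, negative) strengths, each on the 1 to 5 scale."""
    if text.rstrip().endswith("?"):
        return 1, 1  # simplification: treat the whole comment as a question
    positive, negative, boost = 1, 1, 0
    for raw in re.findall(r"[a-z']+", text.lower()):
        word = normalise(raw)
        if word in BOOSTERS:
            boost = BOOSTERS[word]  # applies to the next word only
            continue
        strength = LEXICON.get(word, 0)
        if strength > 0:
            positive = max(positive, min(5, strength + boost))
        elif strength < 0:
            negative = max(negative, min(5, -strength + boost))
        boost = 0
    return positive, negative

print(normalise("helllllo"))
print(score_comment("i looove pizza but hate mondays"))
```

Keeping separate running maxima for positive and negative strength mirrors the "take strongest +ve & -ve expression" heuristic, so one comment can score high on both scales.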
sentiment strength estimation example
comment: HEEEEEEEEY BUDDY!!!!!!!!
translation and extraction of emphasis: HEY BUDDY! (emphasis +1 on HEY, +1 on BUDDY!)
look up words in sentiment strength dictionary:
word     +ve
hey      1
buddy    2
add emphasis: hey 1 + 1 = 2, buddy 2 + 1 = 3
overall: positive 3, negative 1
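The worked example can be traced in code. The sketch below reproduces the arithmetic under two assumptions that the slide does not spell out: a run of three or more identical letters, or two or more exclamation marks, adds +1 emphasis to a word, and repeated letters are collapsed to a single letter (enough for "hey", too crude for words such as "hello"). The function names are illustrative.

```python
import re

LEXICON = {"hey": 1, "buddy": 2}  # strengths from the slide's example

def split_emphasis(token):
    """Return (plain word, emphasis bonus) for one whitespace-separated token."""
    letters = re.sub(r"[^a-z]", "", token.lower())
    # Assumed emphasis cues: 3+ identical letters in a row, or 2+ exclamation marks.
    emphasised = bool(re.search(r"([a-z])\1{2,}", letters) or re.search(r"!{2,}", token))
    word = re.sub(r"([a-z])\1{2,}", r"\1", letters)
    return word, 1 if emphasised else 0

def score(comment):
    """Return (positive, negative), taking the strongest expression of each."""
    positive, negative = 1, 1
    for token in comment.split():
        word, bonus = split_emphasis(token)
        base = LEXICON.get(word, 0)
        if base > 0:
            positive = max(positive, min(5, base + bonus))
        elif base < 0:
            negative = max(negative, min(5, -base + bonus))
    return positive, negative

print(split_emphasis("HEEEEEEEEY"), split_emphasis("BUDDY!!!!!!!!"))
print(score("HEEEEEEEEY BUDDY!!!!!!!!"))
```

Without the emphasis the same comment would score only 2 for positive strength, which is the effect the slide is illustrating.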
SentiStrength vs. std. classifiers
Algorithm                      Positive   Negative
SentiStrength                  60.9%      73.0%
Support Vector Machines        56.2%      73.6%
Simple logistic regression     55.0%      72.8%
J48 classification tree        54.9%      72.6%
Naïve Bayes                    54.9%      67.3%
Decision table                 54.8%      73.8%
JRip rule-based classifier     54.1%      73.1%
Multilayer Perceptron          49.6%      71.4%
Baseline                       41.6%      71.2%
Random                         20.0%      20.0%
10-fold cross-validation on 1041 human-classified comments
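For readers unfamiliar with 10-fold cross-validation, the sketch below shows how 1041 items can be partitioned so that each comment is used for testing exactly once. It is a generic illustration with a hypothetical helper name, not the evaluation code behind the table; a real run would normally shuffle the comments first.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists for k folds of near-equal size."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size

folds = list(k_fold_indices(1041, k=10))
print([len(test) for _, test in folds])
```

Each classifier is trained on the nine training folds and scored on the held-out fold, and the ten scores are averaged.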
application - evidence of emotion homophily in MySpace
automatic analysis of sentiment in 2 million comments exchanged between MySpace friends
correlation of 0.227 for +ve emotion strength and 0.254 for -ve
people tend to use similar but not identical levels of emotion to their friends in messages
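A correlation of this kind compares each person's emotion strength with that of their friends. The slide does not say which coefficient was used, so the sketch below assumes a Pearson correlation; the numbers in the usage line are made up for illustration and are not data from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical average +ve strengths: each member paired with their friends' average.
member = [1.5, 2.0, 2.5, 3.0, 2.2]
friends = [2.1, 1.8, 2.4, 2.3, 2.6]
print(round(pearson(member, friends), 3))
```

Values near 0.2 to 0.25, as reported on the slide, indicate a weak positive association: similar but far from identical levels of emotion.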
conclusions
social network sites are a source of sentiment expressed in very informal language
can identify positive and negative sentiment with reasonable accuracy
applications:
  identifying social trends
  identifying potential emotional “anomalies”