-
The spread of information on Twitter based on sentiment
Haley Knox
Eastern Connecticut State University
Mentors: Dr. Garrett Dancik and Dr. Megan Heenehan
January 26, 2019
Haley Knox (Eastern CT State University) Twitter sentiment analysis January 26, 2019 1 / 22
-
Twitter
A tweet can be spread by ‘liking’, replying, quoting or retweeting.
Some stories spread across the world almost instantaneously.
One Direction star Harry Styles tweeted about his band’s breakup and in fifteen seconds, it moved from the United States to just about every corner of the planet [1].
We look to discover features of what makes information spread on Twitter.
-
Sentiment
We determine if sentiment impacts whether or not a user retweets.
  Are positive or negative tweets more likely to be retweeted?
  Is the number of retweets correlated with sentiment?
We classify the sentiment of a tweet as positive, negative or neutral and analyze how these tweets spread.
We use the package sentimentr in R to calculate the sentiment of tweets.
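As a rough illustration of the three-way labeling, the Python sketch below maps a numeric sentiment score to a class. The scores in this talk come from sentimentr in R; the function name `classify_sentiment` and the `neutral_band` parameter are hypothetical, and treating only a score of exactly zero as neutral is an assumption, since the slides do not state the cutoff.

```python
def classify_sentiment(score, neutral_band=0.0):
    """Label a sentiment score as 'positive', 'negative', or 'neutral'.

    neutral_band is a hypothetical tolerance: scores whose absolute value
    is at or below it count as neutral. The default treats only 0 as neutral.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# The two scores quoted for the example tweets in this talk, plus a zero.
for score in (-1.668267, 1.4546, 0.0):
    print(score, classify_sentiment(score))
```

A wider `neutral_band` would move weakly scored tweets into the neutral class; the right value depends on how the scores are distributed.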
-
Examples
Negative sentiment:
“Because unlike you lazy lot they won’t sigh and go “it is well” after being owed salaries. They’ll fight. Throw chairs. Threaten impeachment. Scheme. They are not lazy nor foolish like we think them. They know their strengths and apply it accordingly.”
Sentiment score: -1.668267
Positive sentiment:
“So nice getting to meet the brilliant Eddie Redmayne! Absolutely love Fantastic Beasts: The Crimes Of Grindelwald! #FantasticBeasts #ProtectTheSecrets #BeFantastic”
Sentiment score: 1.4546
-
Network analysis
We build networks for 20 positive and 20 negative tweets.
We use graph theory measures to see if there are any measurable differences between the networks of positive and negative tweets.
-
How we build our networks
Twitter doesn’t accurately show whom someone retweeted.
According to Twitter, everyone retweets the author of the tweet, so the retweet network is a star graph.
Retweet network based on Twitter’s information. This is a star graph.
-
Our networks
1. We consider all of the retweeters of a tweet.
2. We get the ‘friends’ of all the retweeters.
3. If someone is a friend of a retweeter and is also a retweeter, then we form an edge between them.
This shows where a user saw the tweet from and who they actually retweeted.
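The three steps can be sketched in Python as follows. This is a minimal illustration rather than the code used for the study: the function name `build_retweet_network`, the `friends` dictionary, and the user names are hypothetical. Two further assumptions are that the tweet's author is included as a vertex, and that edges are undirected.

```python
def build_retweet_network(participants, friends):
    """Return undirected edges between participants linked by a 'friend' tie.

    participants: users who retweeted (the author may be included).
    friends: dict mapping a user to the set of accounts that user follows.
    """
    members = set(participants)
    edges = set()
    for user in members:
        # Keep only the friends who are themselves in the retweet network.
        for friend in friends.get(user, set()) & members:
            if friend != user:
                edges.add(tuple(sorted((user, friend))))
    return edges


# Hypothetical data: "x" is followed by "a" but never retweeted, so it is dropped.
participants = ["author", "a", "b", "c"]
friends = {"a": {"author", "x"}, "b": {"a"}, "c": {"b", "author"}}

print(sorted(build_retweet_network(participants, friends)))
```

In this toy case "b" connects to "a" rather than to the author, which is the kind of structure a star graph hides.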
-
The data
We collected 12,000 random tweets at 12pm and 10pm every day for two weeks.
We built networks for one positive and one negative tweet from each sample that had 50-100 retweets and a sentiment score with absolute value greater than 0.9.
Figure: Tweets sampled each day. Tweet count (axis ticks 0 to 15,000) by day of the week, Sunday through Saturday, split by sentiment (negative, positive).
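The selection rule for network tweets (50 to 100 retweets, and a sentiment score above 0.9 in absolute value) can be written as a small filter. The Python sketch below is illustrative only: the function name, its parameters, and the sample records are hypothetical, and treating the 50 and 100 bounds as inclusive is an assumption.

```python
def is_network_candidate(retweet_count, sentiment_score,
                         min_retweets=50, max_retweets=100, min_abs_score=0.9):
    """Return True if a tweet meets the retweet and sentiment thresholds."""
    in_range = min_retweets <= retweet_count <= max_retweets  # bounds assumed inclusive
    strong = abs(sentiment_score) > min_abs_score             # strongly positive or negative
    return in_range and strong


# Hypothetical records: (retweet count, sentiment score).
sample = [(75, 1.2), (75, 0.5), (300, -1.4), (60, -0.95)]
candidates = [tweet for tweet in sample if is_network_candidate(*tweet)]
print(candidates)
```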
-
Figure: Histogram for sentiment of all data. x-axis: Sentiment (ticks −1, 0, 1); y-axis ticks 0 to 60,000.
The distribution of sentiment for all of our samples.
Positive   Negative   Neutral
98,038     61,960     49,039
The number of tweets in our data set by sentiment.
-
Figure: Retweet count by sentiment for each day. One panel per day, Sunday through Saturday; x-axis: Sentiment; y-axis: log(retweet count + 1), ticks 0 to 15.
The number of retweets on the log scale by sentiment for each day of the week. We consider tweets with or without any retweets.
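Adding 1 before taking the logarithm keeps tweets with zero retweets on the plot, since log(0) is undefined while log(0 + 1) = 0. The short Python sketch below shows the transformation; the natural logarithm is an assumption, as the slides do not state the base, and the function name `log_retweets` is hypothetical.

```python
import math


def log_retweets(retweet_count):
    """Return log(retweet_count + 1), the quantity plotted on the y-axis."""
    return math.log1p(retweet_count)  # natural log of (1 + x)


for count in (0, 1, 1000):
    print(count, round(log_retweets(count), 3))
```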
-
Figure: Retweet count by sentiment. x-axis: Sentiment; y-axis: log(retweet count + 1), ticks 0 to 15.
The number of retweets on the log scale by sentiment for our entire dataset. We consider tweets with or without any retweets.
-
Likelihood of retweet based on sentiment
We previously determined that the greater the polarity of a tweet, the more retweets that tweet will receive.
Now we ask whether a tweet is more likely to be retweeted depending on whether it is positive or negative.
Recall that there are more positive tweets than negative tweets in our data set, so we will use proportions.
-
Figure: Retweet count > 1000 by sentiment. x-axis: Sentiment (negative, neutral, positive); y-axis: Proportion (0 to 1); fill: Retweet count > 1000 (No, Yes).
The proportion of negative, neutral, and positive tweets that are retweeted more than 1000 times.
-
Results
15% of negative tweets are retweeted more than 1000 times.
Only 12.3% of positive tweets are retweeted more than 1000 times.
This difference is statistically significant (Fisher's exact test gives a p-value of 0.0004997).
Therefore, negative tweets are more likely than positive tweets to be retweeted more than 1000 times.
The same is true for retweet count > 1 and retweet count > 100.
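Fisher's exact test compares two proportions through a 2x2 table of counts. The Python sketch below computes a two-sided p-value with the standard library only. The function name is hypothetical and the counts are invented toy numbers that merely resemble the 15% and 12.3% rates; they are not the study's data, so the p-value it prints is not the one reported above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    total = sum(p for p in map(prob, range(lowest, highest + 1))
                if p <= p_observed * tolerance)
    return min(1.0, total)


# Hypothetical counts: rows are negative/positive tweets,
# columns are "more than 1000 retweets" yes/no.
p_value = fisher_exact_two_sided(30, 170, 25, 175)
print(p_value)
```

With samples this small the difference would not be expected to reach significance; the real data set is several orders of magnitude larger.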
-
Network analysis
We calculate the Pearson correlation between sentiment score and:
  group betweenness
  group degree
  modularity
  group closeness
  the number of communities
  density
  average clustering coefficient
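For reference, the Pearson correlation between two paired lists can be computed as in the Python sketch below. The function name `pearson_r` and the sample values are hypothetical; they are not measurements from the 40 networks in this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical sentiment scores and network densities for four tweets.
scores = [0.95, 1.10, 1.25, 1.45]
densities = [0.10, 0.08, 0.12, 0.07]
print(pearson_r(scores, densities))
```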
-
Measure                  r(pos)    p-value(pos)  r(neg)    p-value(neg)  Difference  p-value(diff)
Betweenness               0.1771   0.4821         0.0797   0.761          0.0973     0.7898
Degree                    0.0374   0.883          0.3379   0.1846        -0.3006     0.3976
Modularity                0.1368   0.5884         0.3366   0.01186       -0.1998     0.5672
Closeness                -0.1364   0.5894        -0.5944   0.1865         0.458      0.1409
Communities               0.1995   0.4273        -0.3865   0.1254         0.5861     0.1007
Density                  -0.1042   0.6807         0.2824   0.2721        -0.3866     0.288
Avg. clustering coeff.   -0.2399   0.3375         0.3732   0.1401        -0.6132     0.0866
Pearson correlation of graph theoretic measures compared to sentiment score.
-
Figure: Six panels, each comparing negative and positive tweets (x-axis: Sentiment). Panels: Betweenness (group betweenness), Modularity, Communities, Degree (group degree), Avg. clustering coefficient, Closeness (group closeness).
Comparing positive and negative sentiments among our chosen network measures.
-
Network example
Network for a tweet by @AubreyKMiller. The size of the vertices represents the number of followers of that user. The lighter the color, the more friends that user has.
-
Network example
The tweet is by @AubreyKMiller: "So nice getting to meet the brilliant Eddie Redmayne! Absolutely love Fantastic Beasts: The Crimes Of Grindelwald! #FantasticBeasts #ProtectTheSecrets #BeFantastic."
Fantastic Beasts is a movie starring actor Eddie Redmayne.
While Aubrey is the original author of the tweet, she is not the most important user in the network.
  @AubreyKMiller only has a degree of 11.
  @FantasticBeasts has a degree of 56.
Our network is very different from a star graph with @AubreyKMiller as the center vertex.
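The degree comparison can be illustrated with a toy edge list. The Python sketch below uses hypothetical users ("author", "hub", "r1" to "r3"), not the real network for this tweet, and assumes undirected edges. It shows how a retweeter with many connections can outrank the author once edges follow friend relationships instead of all pointing to the author.

```python
from collections import Counter


def degrees(edges):
    """Count how many edges touch each vertex in an undirected edge list."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


retweeters = ["hub", "r1", "r2", "r3"]

# Star graph: every retweeter is linked directly to the author.
star = [("author", r) for r in retweeters]

# Friend-based network: most retweeters are linked to "hub" instead.
rebuilt = [("author", "hub"), ("hub", "r1"), ("hub", "r2"), ("hub", "r3")]

print(degrees(star).most_common(1))
print(degrees(rebuilt).most_common(1))
```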
-
Conclusion
There are more positive tweets than negative tweets.
Negative tweets are slightly more likely to be retweeted than positive tweets.
The larger the absolute value of the sentiment score (i.e. the greater the emotional impact), the more retweets the tweet is likely to have.
-
Acknowledgments
I would like to thank my mentors Dr. Dancik and Dr. Heenehan for their help and constant support throughout this research project and the NCUWM committee that put together this wonderful conference.
-
References
[1] Beres, D. (2016) Watch The Amazing Way Information Spreads On Twitter. Huffington Post, Mar 20, 2016. https://www.huffingtonpost.com/entry/how-twitter-works_us_56ec6480e4b084c672204d74
[2] Jose, A. K., Bhatia, N., and Krishna, S. (2010) Twitter Sentiment Analysis. National Institute of Technology Calicut.
[3] Sarlan, A., Nadam, C., and Basri, S. (2014) Twitter sentiment analysis. 212-216. doi:10.1109/ICIMU.2014.7066632.
[4] https://www.rdocumentation.org/packages/sentimentr. pages 745-750.
[5] Barabási, A.-L. and Pósfai, M. (2016) Network science. Cambridge: Cambridge University Press.