Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different!...
Date posted: 19-Dec-2015
![Page 1: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/1.jpg)
Hasan T Karaoglu
Epidemics in Blogspace
![Page 2: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/2.jpg)
Outline
• Introduction
• Blogs are different!
• Methods are different!
• Contents are different!
• Some methods on Some Content of Some Blogs
• Discussion
![Page 3: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/3.jpg)
Introduction
Blogs are a popular way to share personal journals, discuss matters of public opinion, have collaborative conversations, and aggregate content on similar topics.
Blogs also disseminate new content and novel ideas.
How does content spread across blogs, what kinds of content spread, and at what rate?
![Page 4: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/4.jpg)
Introduction - Epidemics
Epidemics: one way of modeling these aspects
• Physics of Information Diffusion
• Disease Propagation Model
• Susceptible
• Infected
• Recovered
• Mutation?
• Threshold Model for Social Networks
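The Susceptible / Infected / Recovered states above can be illustrated with a small discrete-time simulation. This is only a sketch under stated assumptions: the function name `simulate_sir`, the per-contact infection probability `beta`, and the per-step recovery probability `gamma` are hypothetical names chosen for illustration and do not come from the slides.

```python
import random


def simulate_sir(neighbors, seeds, beta, gamma, steps, rng=None):
    """Run a simple stochastic SIR process on a graph.

    neighbors: dict mapping each node to a list of adjacent nodes.
    seeds:     nodes that start in the Infected state.
    beta:      assumed probability that an infected node infects a
               susceptible neighbor in one step.
    gamma:     assumed probability that an infected node recovers in one step.
    Returns a list with the state of every node after each step.
    """
    rng = rng or random.Random(0)
    state = {node: "S" for node in neighbors}
    for seed in seeds:
        state[seed] = "I"
    history = [dict(state)]
    for _ in range(steps):
        nxt = dict(state)
        for node, status in state.items():
            if status != "I":
                continue
            # An infected node tries to infect each susceptible neighbor.
            for nb in neighbors[node]:
                if state[nb] == "S" and rng.random() < beta:
                    nxt[nb] = "I"
            # The infected node may then recover.
            if rng.random() < gamma:
                nxt[node] = "R"
        state = nxt
        history.append(dict(state))
    return history


chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
history = simulate_sir(chain, seeds=["a"], beta=1.0, gamma=1.0, steps=3)
print(history[-1])
```

With `beta` and `gamma` both set to 1.0 the infection walks down the chain one node per step and every node ends up Recovered; lower values give the partial, random outbreaks that make the epidemic analogy interesting.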
![Page 5: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/5.jpg)
Blogs are different
• YouTube, Flickr (Content Sharing)
• Amazon
• CNN, MSNBC (Web)
• LinkedIn (Professional Networking)
• Orkut, Facebook, Yonja (Social Networking)
• Twitter (?)
• Blogger, Blogspot, LiveJournal, Slashdot (Blogspace)
![Page 6: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/6.jpg)
Blogs are different
• High level of reciprocity
• Symmetric indegree – outdegree
• In contrast to Web (high authority sites)
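Link reciprocity can be measured directly from a list of directed links. The sketch below uses one common definition (the fraction of directed links whose reverse link also exists); the function name `reciprocity` and the toy edge list are illustrative assumptions, not taken from the slides.

```python
def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) also exists.

    edges: iterable of (source, target) pairs; self-links are ignored.
    """
    edge_set = {(u, v) for u, v in edges if u != v}
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


links = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
print(reciprocity(links))  # 0.5
```

A value near 1 means most links are returned, which is the blog-like pattern; a value near 0 is closer to the Web, where many pages link to a few high-authority sites that do not link back.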
![Page 7: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/7.jpg)
Blogs are different
![Page 8: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/8.jpg)
Blogs are different
Average path length is very short compared to the Web. (Directionality?)
![Page 9: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/9.jpg)
Blogs are different
Joint Degree Distribution
(High Degree Nodes Connect to Other High Degree Nodes)
Epidemics on Network Core?
Youtube Celebrities?
![Page 10: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/10.jpg)
Blogs are different
Strongly Connected Core Analysis
• Slowly Increasing Shortest Path
• High Clustering
![Page 11: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/11.jpg)
Blogs are different
Strong Local Clustering (people tend to be introduced to other people via mutual friends)
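The "mutual friends" intuition corresponds to the local clustering coefficient: of all pairs of a node's neighbors, what fraction are themselves connected? A minimal sketch on an undirected graph follows; the function name `local_clustering` and the toy graph are assumptions made for illustration.

```python
from itertools import combinations


def local_clustering(neighbors, node):
    """Local clustering coefficient of `node` in an undirected graph.

    neighbors: dict mapping each node to the set of its adjacent nodes.
    """
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to connect
    linked_pairs = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
    return 2 * linked_pairs / (k * (k - 1))


graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(local_clustering(graph, "b"))  # 1.0: b's two neighbors know each other
```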
![Page 12: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/12.jpg)
Methods are different
• Epidemics
• Gossip
• Influence Map (Word of Mouth)
• Recommendation Based Web (Data) Mining
• Mathematical Modeling (Markov Chains, Information Theory, …)
• …
![Page 13: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/13.jpg)
Contents are different
• Recommendation
• News (Political, Fun, Paparazzi)
• Gossip
• Media (Music, News, Excerpts)
![Page 14: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/14.jpg)
Some methods on Some Content of Some Blogs
Infection inference technique introduced by Adar and Adamic
• Link inference
• Link classification
• Classifier training
• Problems and Challenges
![Page 15: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/15.jpg)
Some methods on Some Content of Some Blogs
Patterns Used for Classifier Training
• The number of common blogs explicitly linked to by both blogs (indicating whether two blogs are in the same community)
• The number of non-blog links (i.e. URLs) shared by the two
• Text similarity
• Order and frequency of repeated infections. Specifically, the number of times one blog mentions a URL before the other and the number of times they both mention the URL on the same day.
• In-link and out-link counts for the two blogs
![Page 16: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/16.jpg)
Some methods on Some Content of Some Blogs
Text Similarity
$s(A,B) = \dfrac{n_{AB}}{\sqrt{n_A}\,\sqrt{n_B}}$
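The similarity formula has the shape of a cosine similarity between two sets. The slide does not define the symbols, so the sketch below assumes that n_A and n_B count the distinct items (for example terms or links) found in blogs A and B, and that n_AB counts the items they share; the function name `set_cosine_similarity` is hypothetical.

```python
import math


def set_cosine_similarity(items_a, items_b):
    """s(A, B) = n_AB / (sqrt(n_A) * sqrt(n_B)) over two collections of items.

    Assumes n_A and n_B are the numbers of distinct items in A and B,
    and n_AB is the number of items common to both.
    """
    a, b = set(items_a), set(items_b)
    if not a or not b:
        return 0.0
    return len(a & b) / (math.sqrt(len(a)) * math.sqrt(len(b)))


blog_a = {"w", "x", "y", "z"}
blog_b = {"w", "p", "q", "r"}
print(set_cosine_similarity(blog_a, blog_b))  # 0.25
```

The score is 1.0 for identical sets and 0.0 for disjoint ones, so it can be compared across blog pairs of very different sizes.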
![Page 17: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/17.jpg)
Some methods on Some Content of Some Blogs
Timing of Infection
![Page 18: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/18.jpg)
Some methods on Some Content of Some Blogs
Link Inference
• Blog URL and Text Similarity Patterns
• Three-way Classifier (57%): reciprocated links, one-way links, unlinked pairs
• Two-way Classifier (SVM 91.2%, Logistic Regression 91.9%): linked, unlinked pairs
Infection Inference
• $n_{A\text{-before-}B} / n_A$, $n_{A\text{-after-}B} / n_A$, $n_{A\text{-same-day-}B} / n_A$
• Timing Patterns (75%)
• With all 6 timing patterns and text/blog similarity patterns (61 – 75%)
• link-in / link-out counts
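The three timing ratios for blog A relative to blog B can be computed from per-blog records of when each URL was first mentioned. This is a sketch under assumptions: n_A is taken to be the number of URLs mentioned by A, and the function name `timing_features` and the input layout (a dict from URL to date) are illustrative only. Swapping the arguments gives the three ratios for B, for six timing patterns in total.

```python
from datetime import date


def timing_features(mentions_a, mentions_b):
    """Return the before / after / same-day ratios for blog A versus blog B.

    mentions_a, mentions_b: dicts mapping a URL to the date the blog
    first mentioned it. Ratios are normalised by n_A, assumed here to
    be the number of URLs mentioned by A.
    """
    n_a = len(mentions_a)
    if n_a == 0:
        return {"before": 0.0, "after": 0.0, "same_day": 0.0}
    before = after = same_day = 0
    for url, day_a in mentions_a.items():
        day_b = mentions_b.get(url)
        if day_b is None:
            continue  # B never mentioned this URL
        if day_a < day_b:
            before += 1
        elif day_a > day_b:
            after += 1
        else:
            same_day += 1
    return {
        "before": before / n_a,
        "after": after / n_a,
        "same_day": same_day / n_a,
    }


blog_a = {
    "u1": date(2005, 1, 1),
    "u2": date(2005, 1, 5),
    "u3": date(2005, 1, 3),
    "u4": date(2005, 1, 2),
}
blog_b = {
    "u1": date(2005, 1, 2),
    "u2": date(2005, 1, 4),
    "u3": date(2005, 1, 3),
}
print(timing_features(blog_a, blog_b))
```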
![Page 19: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/19.jpg)
Some methods on Some Content of Some Blogs
Visualization
• Heuristics using classifiers
• Two types of graph
• Directed Acyclic Graph
• Most likely tree
![Page 20: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/20.jpg)
Some methods on Some Content of Some Blogs
Epidemic Propagation Model by Gruhl et al.
• Topics
• Individuals
Topics
• Topic = Chatter + Spike + (Resonance)
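To make the chatter / spike split concrete, here is a deliberately crude sketch that separates a topic's daily post counts into a background level and unusually busy days. The rule used (median as the chatter level, mean plus `k` standard deviations as the spike threshold) is an assumption made for illustration and is not the procedure from Gruhl et al.; the function name `split_chatter_and_spikes` is hypothetical.

```python
import statistics


def split_chatter_and_spikes(daily_counts, k=2.0):
    """Return (chatter_level, spike_days) for a series of daily post counts.

    chatter_level: median daily count, used as a stand-in for chatter.
    spike_days:    indices of days whose count exceeds mean + k * std.
    Both rules are illustrative assumptions.
    """
    chatter_level = statistics.median(daily_counts)
    mean = statistics.mean(daily_counts)
    std = statistics.pstdev(daily_counts)
    threshold = mean + k * std
    spike_days = [i for i, count in enumerate(daily_counts) if count > threshold]
    return chatter_level, spike_days


counts = [5, 6, 5, 7, 6, 40, 6, 5]
print(split_chatter_and_spikes(counts))
```

On this toy series the background level is about six posts per day and only the day with 40 posts is flagged as a spike.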
![Page 21: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/21.jpg)
Some methods on Some Content of Some Blogs
![Page 22: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/22.jpg)
Some methods on Some Content of Some Blogs
![Page 23: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/23.jpg)
Some methods on Some Content of Some Blogs
aoccdrnig to rscheearch at an elingsh uinervtisy it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer is at the rghit pclae
![Page 24: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/24.jpg)
Some methods on Some Content of Some Blogs
Power-law Characteristic for Individuals
Different Posting Behaviors for Individuals
![Page 25: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/25.jpg)
Some methods on Some Content of Some Blogs
Propagation Model
Cascading Model
• Copy Probability κ(v,w)
• Noticing Probability r(v,w)
For 7K topics, r has mean 0.28 and std 0.22; κ is quite low, with mean 0.04 and std 0.07.
Even bloggers who commonly read from another source are selective in the topics they choose to write about.
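A cascade with separate noticing and copy probabilities can be sketched as follows. This is a simplified reading of the model, not the authors' implementation: at each step a reader w notices an infected blog v's post with probability r(v,w); once noticed, w gets a single chance to copy the topic with probability κ(v,w). The function name `simulate_cascade` and the `edges` layout are assumptions.

```python
import random


def simulate_cascade(edges, seeds, max_steps=50, rng=None):
    """Simulate topic spread with per-edge noticing and copy probabilities.

    edges: dict mapping (v, w) to (r, kappa), meaning w reads v, notices
           v's post with probability r per step, and copies it with
           probability kappa once noticed.
    seeds: blogs that write about the topic initially.
    Returns the set of blogs that end up writing about the topic.
    """
    rng = rng or random.Random(0)
    infected = set(seeds)
    # (v, w) pairs where w has not yet noticed v's post
    pending = {(v, w) for (v, w) in edges if v in infected}
    for _ in range(max_steps):
        if not pending:
            break
        newly_infected = set()
        still_pending = set()
        for v, w in sorted(pending):
            if w in infected:
                continue  # w already wrote about the topic
            r, kappa = edges[(v, w)]
            if rng.random() < r:
                # w noticed the post and gets one chance to copy it
                if rng.random() < kappa:
                    newly_infected.add(w)
            else:
                still_pending.add((v, w))  # w may notice it in a later step
        infected |= newly_infected
        # newly infected blogs expose their own readers
        for v, w in edges:
            if v in newly_infected and w not in infected:
                still_pending.add((v, w))
        pending = still_pending
    return infected


# Illustrative values only, set to the reported means of r and kappa.
edges = {("a", "b"): (0.28, 0.04), ("b", "c"): (0.28, 0.04)}
print(simulate_cascade(edges, seeds=["a"]))
```

With κ around 0.04, most simulated cascades stop at the seed, which matches the slide's point that bloggers are selective about what they copy.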
![Page 26: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/26.jpg)
Discussion
• Could we use these models to extract further patterns or characteristics?
• Classification of hoaxes, fake news?
• Prediction of popular songs, videos at their inception
• …
![Page 27: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/27.jpg)
Thanks!
Q & A
![Page 28: Hasan T Karaoglu. Introduction Blogs are different! Methods are different! Contents are different! Some methods on Some Content of Some Blogs Discussion.](https://reader035.fdocuments.net/reader035/viewer/2022062421/56649d2e5503460f94a0519a/html5/thumbnails/28.jpg)
References
D. W. Drezner and H. Farrell, “Web of Influence,” Foreign Policy, vol. 145, pp. 32-40, Dec. 2004.
E. Adar and L. A. Adamic, “Tracking Information Epidemics in Blogspace,” Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 207-214, 2005.
D. Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins, “Information Diffusion through Blogspace,” Proceedings of the 13th International Conference on World Wide Web, pp. 491-501, 2004.
A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee, “Measurement and Analysis of Online Social Networks,” Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, pp. 29-42, 2007.
M. Cha, J. A. N. Perez, and H. Haddadi, “Flash Floods and Ripples: The Spread of Media Content through the Blogosphere,” 3rd Int'l AAAI Conference on Weblogs and Social Media (ICWSM) Data Challenge Workshop, May 17-20, 2009, San Jose, California.
M. Young, The Technical Writer's Handbook. Mill Valley, CA: University Science, 1989.
Z. Fanzi, Q. Zhengding, L. Dongsheng, and Y. Jianhai, “Shape-based time series similarity measure and pattern discovery algorithm,” Journal of Electronics, vol. 22, pp. 142-148, Aug. 2007.