Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Social Web 2014, Lecture IV: How can we MINE, ANALYSE & VISUALISE the Social Web? (1). Lora Aroyo, The Network Institute, VU University Amsterdam.


http://thesocialweb2014.wordpress.com/

Transcript of Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Page 1: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Lecture IV: How can we MINE, ANALYSE & VISUALISE the Social Web? (1)

Lora Aroyo, The Network Institute

VU University Amsterdam

Social Web 2014

Page 2: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

• 25 billion tweets on Twitter in 2010, by 175 million users

• 360 billion pieces of content on Facebook in 2010, by 600 million different users

• 35 hours of video uploaded to YouTube every minute

• 130 million photos uploaded to Flickr per month

The Age of BIG Data

Page 3: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Science with BIG Data

Page 4: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

BIG Data Challenges

Page 5: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

enormous wealth of data = lots of insights

• insights into users’ daily lives and activities

• insights into history

• insights into politics

• insights into communities

• insights into trends

• insights into businesses & brands

Why?

Page 6: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

enormous wealth of data = lots of insights

• who uploads/talks? (age, gender, nationality, community, etc.)

• what are the trending topics? when?

• what else do these users like? on which platform?

• who are the most/least active users?

• …

Why?

Page 7: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Image: http://www.co.olmsted.mn.us/prl/propertyrecords/RecordingDocuments/PublishingImages/forms.jpg

This doesn’t work

Page 8: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

How about this?

Page 9: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Who uses it?

Page 10: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Politicians Governmental institutions

Page 11: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Whole society

Page 12: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Whole society

repurposing data

danger of second order effect

Page 13: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Whole society

repurposing data

danger of second order effect

Page 14: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)
Page 15: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Whole society

repurposing data

discoveries & correlations

Web-Scale Pharmacovigilance: Listening to Signals from the Crowd, R.W. White et al (2013)

Page 16: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Whole society

repurposing data

discoveries & correlations

Web-Scale Pharmacovigilance: Listening to Signals from the Crowd, R.W. White et al (2013)

Page 17: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Whole society

repurposing data

discoveries & correlations

Web-Scale Pharmacovigilance: Listening to Signals from the Crowd, R.W. White et al (2013)

Page 18: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Scientists

Bibliometrics

Page 19: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Scientists

Bibliometrics

Page 20: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Scientists

Bibliometrics

Page 21: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Culture History

Page 22: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Culture History

Page 23: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Culture History

Page 24: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Culture History

Page 25: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Culture History

Page 26: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Culture

Bill Howe, University of Washington

Page 27: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Entertainment

Page 28: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Entertainment

Page 29: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Entertainment

Page 30: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

You?

Page 31: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

You?

Page 32: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Companies

Page 33: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Who does it?

Page 34: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

The Rise of the Data Scientist

Page 35: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

The Rise of the Data Scientist

Page 36: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

The Rise of the Data Scientist

Page 37: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

The Rise of the Data Scientist

Page 38: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

The Rise of the Data Scientist

Data Geeks

Skills: statistics, data munging, visualisation

Page 39: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

http://radar.oreilly.com/2010/06/what-is-data-science.html

The Rise of the Data Scientist

Page 40: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

• Data Science enables the creation of data products

• Data products are applications that acquire their value from the data, and create more data as a result.

• Users are in a feedback loop: they constantly provide information about the products they use, which gets used in the data product.

Data Science

Page 41: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Data Science Venn Diagram

Drew Conway

Page 42: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Page 43: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Popular Data Products

Data Science is about building products

not just answering questions

Page 44: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Popular Data Products

empower others to use the data

empower others to do their own analysis

Page 45: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

(Inspired by George Tziralis’ FOSS Conf’09, John Elder IV’s Salford Systems Data Mining Conf. and Toon Calders’ slides)

Data mining is the exploration & analysis of large quantities of data in order to discover valid, novel, potentially useful, & ultimately understandable patterns in data

http://www.freefoto.com/images/33/12/33_12_7---Pebbles_web.jpg

Data Mining 101

Page 46: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Databases

Statistics

Artificial Intelligence

Data Mining 101

• Data input & exploration

• Preprocessing

• Data mining algorithms

• Evaluation & Interpretation

Page 47: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

• What data do I need to answer question X?

• What variables are in the data?

• Basic stats of my data?

Data Input & Exploration

“LikeMiner”
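
The exploration questions on this slide can be tried on a toy dataset. The following is a minimal sketch in Python, assuming a hypothetical list of "like" records; the field names `user`, `page` and `age` and all values are invented for illustration and are not LikeMiner's actual schema.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical "like" records; field names and values are invented.
likes = [
    {"user": "u1", "page": "jazz", "age": 21},
    {"user": "u2", "page": "jazz", "age": 34},
    {"user": "u2", "page": "football", "age": 34},
    {"user": "u3", "page": "football", "age": None},  # missing value
    {"user": "u4", "page": "jazz", "age": 27},
]

def explore(records):
    """Return the variables present and a few basic statistics."""
    variables = sorted({key for rec in records for key in rec})
    # Keep one age per user and skip missing values.
    ages_by_user = {r["user"]: r["age"] for r in records if r["age"] is not None}
    ages = list(ages_by_user.values())
    return {
        "variables": variables,
        "n_records": len(records),
        "n_users": len({r["user"] for r in records}),
        "likes_per_page": Counter(r["page"] for r in records),
        "mean_age": mean(ages),
        "median_age": median(ages),
        "missing_age": sum(r["age"] is None for r in records),
    }

summary = explore(likes)
```

Counting missing values at this stage already hints at the cleanup work that preprocessing has to do.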

Page 48: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

• Cleanup!

• Choose a suitable data model

• What happens if you integrate data from multiple sources?

• Reformat your data

Preprocessing

“LikeMiner”
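
The preprocessing steps listed on this slide (cleanup, choosing a data model, integrating sources, reformatting) can be sketched as follows. This is only an illustration under assumptions: the two sources, their field names (`id`, `user_id`, `joined`, `likes`) and the chosen data model (one dictionary per user) are hypothetical.

```python
from datetime import datetime

# Two hypothetical sources describing the same users in different formats.
source_a = [
    {"id": "U1", "name": " Alice ", "joined": "2010-03-01"},
    {"id": "u2", "name": "BOB", "joined": "2011-07-15"},
]
source_b = [
    {"user_id": "u1", "likes": ["Jazz", "jazz ", "Football"]},
    {"user_id": "u3", "likes": ["Cooking"]},
]

def clean_text(value):
    """Cleanup: trim whitespace and normalise case."""
    return value.strip().lower()

def integrate(a_records, b_records):
    """Merge both sources into one data model keyed by a normalised user id."""
    users = {}
    for rec in a_records:
        uid = clean_text(rec["id"])
        users[uid] = {
            "name": clean_text(rec["name"]).title(),
            # Reformat: parse the date string into a date object.
            "joined": datetime.strptime(rec["joined"], "%Y-%m-%d").date(),
            "likes": [],
        }
    for rec in b_records:
        uid = clean_text(rec["user_id"])
        # A user may appear in only one source: keep it, with gaps marked as None.
        user = users.setdefault(uid, {"name": None, "joined": None, "likes": []})
        # Deduplicate likes after normalisation.
        user["likes"] = sorted({clean_text(x) for x in rec["likes"]} | set(user["likes"]))
    return users

users = integrate(source_a, source_b)
```

Integration surfaces the typical problems: identifiers that differ only in case, duplicate values, and users that exist in one source but not the other.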

Page 49: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

• Classification: Generalising a known structure to apply it to new data

• Association: Finding relationships between variables

• Clustering: Discovering groups and structures in data

Data Mining Algorithms
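
Of the three families above, association is the easiest to show in a few lines. The sketch below computes support and confidence for pairwise rules of the form "likes a, therefore likes b". The data and the thresholds `min_support` and `min_confidence` are invented for illustration; it is not a full Apriori implementation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the set of pages each user likes.
baskets = [
    {"jazz", "vinyl", "coffee"},
    {"jazz", "vinyl"},
    {"jazz", "coffee"},
    {"football", "coffee"},
    {"jazz", "vinyl", "football"},
]

def association_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return rules (a, b, support, confidence) meaning 'likes a -> likes b'."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (x, y), count in pair_counts.items():
        support = count / n  # fraction of users who like both pages
        if support < min_support:
            continue
        for a, b in ((x, y), (y, x)):
            confidence = count / item_counts[a]  # P(likes b | likes a)
            if confidence >= min_confidence:
                rules.append((a, b, support, confidence))
    return sorted(rules)

rules = association_rules(baskets)
```

With this toy data only the jazz/vinyl pair passes both thresholds, and the rule is stronger in the vinyl-to-jazz direction than the other way round, which shows that confidence is not symmetric.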

Page 50: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

• Filter users by interests

• Construct user graphs

• PageRank on graphs to mine representativeness

• Result: set of influential users

• Compare page topics to user interests to find pages most representative for topics

Mining in “LikeMiner”
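
The "PageRank on graphs" step can be illustrated with a plain power-iteration sketch. This is not LikeMiner's implementation: the user graph, the meaning of an edge, the damping factor of 0.85 and the iteration count are all assumptions made for the example.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank

# Hypothetical user graph: an edge u -> v means "u interacts with v's content".
user_graph = {
    "ann": ["cat"],
    "bob": ["cat"],
    "cat": ["dan"],
    "dan": ["cat"],
    "eve": ["ann", "cat"],
}

ranks = pagerank(user_graph)
influential = sorted(ranks, key=ranks.get, reverse=True)[:2]
```

Users who receive many links from other well-ranked users end up at the top, which is the sense of "representativeness" that a ranking like this captures.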

Page 51: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Evaluation & Interpretation

What does the pattern I found mean?

• Pitfalls:

• Meaningless Discoveries

• Implication ≠ Causality (Intensive care -> death)

• Simpson’s paradox

• Data Dredging

• Redundancy

• No New Information

• Overfitting

• Bad Experimental Setup
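
Simpson's paradox from the list above is easiest to grasp with numbers. In the sketch below all counts are invented: option A has the higher success rate inside each segment, yet option B has the higher rate once the segments are pooled, because the two options are spread very unevenly over the segments.

```python
# Invented counts: (successes, trials) per segment for two options A and B.
data = {
    "segment 1": {"A": (8, 10), "B": (70, 100)},
    "segment 2": {"A": (20, 100), "B": (1, 10)},
}

def rate(successes, trials):
    return successes / trials

def winner_per_segment(data):
    """Which option has the higher success rate inside each segment?"""
    return {
        seg: max(options, key=lambda o: rate(*options[o]))
        for seg, options in data.items()
    }

def winner_overall(data):
    """Which option has the higher success rate once segments are pooled?"""
    totals = {}
    for options in data.values():
        for option, (s, t) in options.items():
            prev_s, prev_t = totals.get(option, (0, 0))
            totals[option] = (prev_s + s, prev_t + t)
    return max(totals, key=lambda o: rate(*totals[o]))

per_segment = winner_per_segment(data)  # A wins in both segments
overall = winner_overall(data)          # B wins on the pooled counts
```

The lesson for interpretation is to check whether a pattern survives when the data is split by an obvious grouping variable.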

Page 52: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)
Page 53: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Data Mining is not easy

Page 54: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Data Journalism

Page 55: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)
Page 56: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)
Page 57: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Page 58: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

source: http://kunau.us/wp-content/uploads/2011/02/Screen-shot-2011-02-09-at-9.03.46-PM-w600-h900.png

Mining Social Web Data

Page 59: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Source: http://infosthetics.com/archives/2011/12/all_the_information_facebook_knows_about_you.html

See also: http://www.youtube.com/watch?feature=player_embedded&v=kJvAUqs3Ofg

Single Person

Page 60: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

http://www.brandrants.com/brandrants/obama/

Populations

Page 61: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Brand Sentiment via Twitter

http://flowingdata.com/2011/07/25/brand-sentiment-showdown/

Page 62: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Sentiment Analysis as Service

Page 63: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

http://text-processing.com/demo/sentiment/
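
To give a feel for what a sentiment service returns, here is a deliberately naive lexicon-based sketch. It is not how the service linked above works (that is not described in these slides); the word lists, the negation rule and the labels `pos`, `neg` and `neutral` are assumptions made for illustration.

```python
import re

# Tiny invented lexicon; real systems use much larger lists or trained classifiers.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "sad", "terrible"}
NEGATIONS = {"not", "no", "never"}

def sentiment(text):
    """Label text as 'pos', 'neg' or 'neutral' by counting lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        # Flip polarity when the previous word is a simple negation.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neutral"
```

Calling `sentiment("This is not good")` yields `"neg"` because of the negation rule. Sarcasm, slang and domain-specific vocabulary are where such simple approaches break down.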

Page 64: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book.pdf

Recommended Reading

Page 65: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

http://www.actmedia.eu/media/img/text_zones/English/small_38421.jpg

Assignment 2: Semantic Markup

• Part I: enrich/create a Web page with semantic markup

• Step 1: Mark up two different Web pages with appropriate markup describing at least people, relationships to other people, locations, some temporal data and some multimedia. You can also try out tools such as Google Markup Helper.

• Step 2: Validate your semantic markup using an existing validator.

• Step 3: Explain why you chose particular markups. Compare the advantages and disadvantages of the different markups. Include screenshots from validators.

• Part II: analyse another team’s Web page markup, as a consumer & as a publisher

• Step 1: Perform an evaluation and report your findings (consider findability or content extraction).

• Step 2: Support your critique with examples of how the semantic markup could be improved.

• In an introductory section, explain what semantic markup is, what it is for, what it looks like, etc.

• Support your choices and explanations with appropriate literature references.

• 5 pages (excluding screenshots).

• Other group’s evaluation details in an appendix.

• Deadline: 4 March 23:59

Page 66: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

Image Source: http://blog.compete.com/wp-content/uploads/2012/03/Like.jpg

Final Assignment: Your SocWeb App

• Create your own Social Web app (in a group)

• Use structured data, entity relations, data analysis, visualisation

• Write an individual report on one of the main aspects of your app

• Pitch your app idea before finalising: 13 March, during Hands-on

• Submit: 28 March 23:59

Page 67: Lecture 4: How can we MINE, ANALYSE & VISUALISE the Social Web? (2014)

image source: http://www.flickr.com/photos/bionicteaching/1375254387/

Hands-on Teaser

• Build your own recommender system 101

• Recommend pages on del.icio.us

• Recommend pages to your Facebook friends
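
As a taste of the hands-on, a very small user-based recommender can be sketched as below. It assumes bookmark data in the form "user to set of saved pages" and uses Jaccard similarity; the data, the page labels and the function names are hypothetical, and the hands-on session itself may use a different method.

```python
def jaccard(a, b):
    """Similarity of two sets: size of overlap divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, bookmarks, top_n=3):
    """Recommend pages that similar users saved and the target has not."""
    seen = bookmarks[target]
    scores = {}
    for user, pages in bookmarks.items():
        if user == target:
            continue
        sim = jaccard(seen, pages)
        # Each unseen page is scored by the similarity of the users who saved it.
        for page in pages - seen:
            scores[page] = scores.get(page, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, score in ranked[:top_n] if score > 0]

# Hypothetical bookmark data: user -> set of saved pages.
bookmarks = {
    "you": {"python-tutorial", "dataviz-gallery"},
    "ann": {"python-tutorial", "dataviz-gallery", "graph-analysis"},
    "bob": {"python-tutorial", "stats-primer"},
    "cat": {"cooking-blog"},
}

suggestions = recommend("you", bookmarks)
```

Pages saved only by users with no overlap with the target score zero and are left out, so the list can be shorter than `top_n`.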