Source: cse.iitkgp.ac.in/~pawang/courses/SC14/lec2.pdf


Measurement and Collection of Social Network Data

Pawan Goyal

CSE, IITKGP

July 22, 2014

Pawan Goyal (IIT Kharagpur) Measurement and Collection of Social Network Data July 22, 2014 1 / 25


Social Network Site (SNS)

“Web-based services that allow individuals to (1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system.” (Boyd and Ellison, 2007, p. 211)


What People Do in Social Media

People make “friends” with others and build social relationships, connections and communities.

People ask one another questions and answer them.

People create, publish or distribute information in the form of text, photos, video, audio, or tweets.

People share bookmarks, presentation slides, or other files.

People provide feedback on or rate others’ information.

People create social tags or folksonomies.


Social Networking Websites


Reference for this lecture


Tools Used

IPython Notebook

You may use the following URL: http://smash.psych.nyu.edu/courses/spring12/modeling/ipythonhints.html


Twitter

A microblogging service, allowing people to communicate with short, 140-character messages that roughly correspond to thoughts or ideas.

Twitter’s relationship model

Allows you to keep up with the latest happenings of any other user, even though the other user may not choose to follow you back or even know that you exist.

Interest Graphs

A way of modeling connections between people and their arbitrary interests.


Fundamental Twitter Terminology

Tweets are the essence of Twitter. In addition to the textual content, we get:

Tweet entities

User mentions, hashtags, URLs and media

Places

Locations in the real world, attached to a tweet


Fundamental Twitter Terminology

@ptwobrussell is writing @SocialWebMining, 2nd Ed. from his home office in Franklin, TN. Be #social: http://on.fb.me/16WJAf9

Tweet entities

User mentions: @ptwobrussell, @SocialWebMining

Hashtag: #social

URL: http://on.fb.me/16WJAf9

Place: Franklin, Tennessee, mentioned in the tweet text and also recorded as location metadata for where the tweet was authored
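
The entities listed above can be approximated directly from the tweet text. The following is a minimal sketch, assuming simplified regular expressions; the function name extract_entities is hypothetical, and the patterns do not reproduce Twitter's own entity extraction, which the API returns as structured metadata alongside each tweet.

```python
import re

# Simplified patterns for illustration; real tweets contain edge cases
# (punctuation, non-ASCII hashtags, shortened URLs) that these ignore.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
URL_RE = re.compile(r"https?://\S+")


def extract_entities(text):
    """Return user mentions, hashtags and URLs found in a tweet's text."""
    urls = URL_RE.findall(text)
    # Remove URLs first so that '#' or '@' inside a URL is not miscounted.
    stripped = URL_RE.sub(" ", text)
    return {
        "user_mentions": MENTION_RE.findall(stripped),
        "hashtags": HASHTAG_RE.findall(stripped),
        "urls": urls,
    }


tweet = ("@ptwobrussell is writing @SocialWebMining, 2nd Ed. from his home "
         "office in Franklin, TN. Be #social: http://on.fb.me/16WJAf9")
entities = extract_entities(tweet)
print(entities)
```

Place information is not recoverable this way in general: it is attached to the tweet as metadata rather than parsed from the text.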


Fundamental Twitter Terminology

Timelines

Chronologically sorted collections of tweets

Home timeline: the view that you see when you log into your account and look at all the tweets from users that you are following: https://twitter.com

User timeline: a collection of tweets only from a certain user: https://twitter.com/timoreilly

User’s home timeline: can be accessed by appending the additional suffix “following” to the URL: https://twitter.com/timoreilly/following


user timeline


user home timeline


TweetDeck: a highly customizable user interface


Streams

Samples of public tweets flowing through Twitter in real time

Public firehose

Known to peak at hundreds of thousands of tweets per minute during events with particularly wide interest.

Public timeline

A small random sample of the public timeline is available, providing filterable access to enough public data for API developers.


Creating Twitter API connection

Create an application at https://dev.twitter.com/apps.


Key pieces of information

Create a new Twitter application to get OAuth credentials and API access

Consumer key

Consumer secret

Access token

Access token secret
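
One way to keep these four values out of notebook code is to read them from the environment. The following is a minimal sketch, assuming hypothetical environment variable names and a hypothetical helper load_credentials; it only collects and checks the values and does not contact Twitter.

```python
import os

# Hypothetical environment variable names, chosen for this sketch only.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Collect the four OAuth values, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


# Example with placeholder values; real values come from the application page.
example_env = {key: "placeholder" for key in REQUIRED_KEYS}
credentials = load_credentials(example_env)
print(sorted(credentials))
```

The resulting dictionary would then be handed to whichever Twitter client library is in use; that step depends on the library and is left out here.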


Exploring Twitter data

https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition

Requires: IPython Notebook

Things to try out

Exploring Trending Topics

Searching for tweets using a query

Analyzing the 140 characters


Exploring Twitter data

Computing Lexical Diversity

Number of unique tokens in the text, divided by the total number of tokens in the text

Can be a primitive statistic for

How broad or narrow the subject matter is that an individual or group discusses

Comparing different groups/individuals

Comparing across time periods
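
The definition of lexical diversity above can be written in a few lines. This is a minimal sketch, assuming naive whitespace tokenization and lowercasing; the sample tweets are made up for illustration, and the choice to return 0.0 for empty input is an assumption of this sketch.

```python
def lexical_diversity(tokens):
    """Unique tokens divided by total tokens; 0.0 for empty input."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical tweets, tokenized naively on whitespace.
tweets = ["be social be kind", "mining the social web"]
tokens = [word.lower() for tweet in tweets for word in tweet.split()]

diversity = lexical_diversity(tokens)
print(diversity)  # 6 unique tokens out of 8 gives 0.75
```

A value near 1.0 means almost every token is distinct; a value near 0.0 means the same few tokens are repeated, which is what makes the statistic usable for comparing groups, individuals, or time periods.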


Accessing Facebook user’s account data

Register an application as the entry point into the Facebook developer platform.

Data accessible

Whatever the user has explicitly authorized it to access.

As a developer, the application would be like any of your Facebook friends, in that you are ultimately in control of what the application can access.

Developer Tools

Graph API Explorer app for querying the Social Graph

You can translate your queries into Python code for automation
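
A query tried out in the Graph API Explorer corresponds to a plain HTTPS URL, which is what makes the translation into Python possible. The following is a minimal sketch, assuming an unversioned endpoint, a placeholder access token, and a hypothetical helper build_graph_url; it only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

GRAPH_API_ROOT = "https://graph.facebook.com"


def build_graph_url(node, fields, access_token):
    """Build the URL for a Graph API query on one node."""
    query = urlencode({"fields": ",".join(fields), "access_token": access_token})
    return f"{GRAPH_API_ROOT}/{node}?{query}"


# "me" refers to the user who authorized the token; the token is a placeholder.
url = build_graph_url("me", ["id", "name"], "ACCESS_TOKEN_PLACEHOLDER")
print(url)
```

Fetching the URL with an HTTP client would return JSON limited to what the user has authorized the application to access.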


Graph API query


Analyzing Facebook Pages


Further Exploration

Things to try out

Analyzing things your friends “like”: top ten things, categories

Common likes between an ego and its friendships
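
Both exercises reduce to counting and set intersection once the likes have been collected. The following is a minimal sketch, assuming the likes are already available as Python sets; all names and likes below are made-up illustration data, not the output of a real Graph API call.

```python
from collections import Counter

# Hypothetical data: the pages liked by an ego and by each of its friends.
ego_likes = {"Python", "Coffee", "Cricket"}
friend_likes = {
    "asha": {"Python", "Cricket", "Chess"},
    "ravi": {"Coffee", "Python"},
    "meera": {"Chess", "Tea"},
}

# Count how many friends like each thing, then keep the ten most common.
like_counts = Counter(like for likes in friend_likes.values() for like in likes)
top_ten = like_counts.most_common(10)

# Likes shared between the ego and each friend.
common_likes = {name: ego_likes & likes for name, likes in friend_likes.items()}

print(top_ten)
print(common_likes)
```

Grouping by category would follow the same pattern, counting a category label per like instead of the like itself.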
