ESSIR 2013 - IR and Social Media
![Page 1: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/1.jpg)
9th European Summer School in Information Retrieval September 4th, 2013
http://bit.ly/ESSIR13IRSocMedia
IR and Social Media
Arjen P. de Vries
Centrum Wiskunde & Informatica
Delft University of Technology
Spinque B.V.
![Page 2: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/2.jpg)
On SlideShare, IR = Investor Relations
![Page 3: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/3.jpg)
Social Media
Noun
social media (plural only)
Interactive forms of media that allow users to interact with and publish to each other, generally by means of the Internet.
The early 21st century saw a huge increase in social media thanks to the widespread availability of the Internet.
![Page 4: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/4.jpg)
http://www.webanalyticsworld.net/2010/11/history-of-social-media-infographic.html
![Page 5: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/5.jpg)
Social Media
“Social bookmarking” sites
“User generated content”
  Images (Flickr) and videos (YouTube, Vimeo), but also blogs
Social network services
  Twitter, Facebook
![Page 6: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/6.jpg)
Not just one beast!
![Page 7: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/7.jpg)
![Page 8: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/8.jpg)
![Page 9: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/9.jpg)
IR and Social Media?
![Page 10: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/10.jpg)
Red Hot Chili Peppers
![Page 11: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/11.jpg)
“Rock group” in author’s metadata...
Organisation in groups may help disambiguate the query!
More implicit metadata...
![Page 12: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/12.jpg)
Information Science
“Search for the fundamental knowledge which will allow us to postulate and utilize the most efficient combination of [human and machine] resources”
M.E. Senko. Information systems: records, relations, sets, entities, and things. Information systems, 1(1):3–13, 1975.
![Page 13: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/13.jpg)
Core Questions
How to represent information?
  The information need and search requests
  The objects to be shown in response to an information request
How to match information representations?
![Page 14: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/14.jpg)
IR and Social Media
Richer information representations!
![Page 15: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/15.jpg)
Richer representations
User profiles
  User name, full name, description, image, homepage URL, etc.
Connections between users
  Networks of friends, followers, etc.
Comments/reactions
Endorsing and sharing
![Page 16: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/16.jpg)
Q: Is the Web ancient social media?
![Page 17: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/17.jpg)
(C) 2008, The New York Times Company
Anchor text: “continue reading”
![Page 18: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/18.jpg)
Not a lot of info to represent the page…
A fan’s Hyves page: Kyteman's HipHop Orchestra: www.kyteman.com
Luxor Theater ticket sales: 22 May - Kyteman's Hiphop Orkest - www.kyteman.com
Kluun.nl: The site of Kyteman
Blog Rockin’ Beats: The 21-year-old Kyteman (trumpeter, composer and producer Colin Benders) worked for three years on his debut: The Hermit Sessions.
Jazzenzo: ...a performance by the popular Kyteman’s Hiphop Orkest
![Page 19: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/19.jpg)
![Page 20: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/20.jpg)
‘Co-creation’
Social Media: Consumer becomes a co-creator
‘Data consumption’ traces
In essence: many new sources to play the role of anchor text
  Tags and/or ratings
  Tweets
  Comments, reviews
![Page 21: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/21.jpg)
Potential Benefits for IR
Expand content representation
Reduce the vocabulary gap(s) between creators of content, indexers, and users
More diverse views on the same content
![Page 22: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/22.jpg)
![Page 23: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/23.jpg)
Potential Benefits for IR
Relevance depends on user context
  User task
  User knowledge
Social media provide an opportunity to make much better assumptions about user context
  A specific user’s context
  The variety of user contexts that may exist
![Page 24: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/24.jpg)
Maarten Clements, Arjen P. de Vries and Marcel J.T. Reinders.
The task dependent effect of tags and ratings on social media access.
TOIS 28, 4, article 21 (November 2010), 42 pages.
![Page 25: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/25.jpg)
LibraryThing
![Page 26: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/26.jpg)
LibraryThing
Items People Tags Ratings
See also: http://www.macle.nl/tud/LT/
![Page 27: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/27.jpg)
Synonyms
![Page 28: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/28.jpg)
Synonyms
![Page 29: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/29.jpg)
![Page 30: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/30.jpg)
Examples
Humour
Classic
![Page 31: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/31.jpg)
![Page 32: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/32.jpg)
![Page 33: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/33.jpg)
Search with Random Walk
Rank nodes by the estimated probability that a random walk, starting from (task-dependent) start nodes, ends at that node
E.g., tag suggestion starts in a tag node; personalized search starts in both a tag node and a user node
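
A minimal sketch of such a walk, under stated assumptions: the tiny user/item/tag graph, the node names (`u:`, `i:`, `t:` prefixes) and the function names are all hypothetical, edges are unweighted, and the walk has a fixed number of steps. It is an illustration of the idea on this slide, not the model from the TOIS paper.

```python
from collections import defaultdict

# Hypothetical toy graph: users and tags are linked to the items they touch.
EDGES = [
    ("u:alice", "i:java_book"), ("u:alice", "i:python_book"),
    ("u:bob", "i:java_travel"), ("u:bob", "i:bali_guide"),
    ("t:java", "i:java_book"), ("t:java", "i:java_travel"),
    ("t:programming", "i:java_book"), ("t:programming", "i:python_book"),
    ("t:travel", "i:java_travel"), ("t:travel", "i:bali_guide"),
]

def build_graph(edges):
    """Undirected adjacency sets."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def random_walk(graph, start_nodes, steps):
    """Probability of being at each node after `steps` uniform random steps,
    starting from a uniform distribution over `start_nodes`."""
    prob = {node: 1.0 / len(start_nodes) for node in start_nodes}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in prob.items():
            neighbours = graph[node]
            for nb in neighbours:
                nxt[nb] += mass / len(neighbours)
        prob = dict(nxt)
    return prob

def rank_items(prob):
    """Item nodes only, most probable first (ties broken by name)."""
    items = [(n, p) for n, p in prob.items() if n.startswith("i:")]
    return sorted(items, key=lambda kv: (-kv[1], kv[0]))

graph = build_graph(EDGES)
# The tag alone cannot separate the two senses of "java" in this graph.
print(rank_items(random_walk(graph, ["t:java"], steps=3)))
# Starting in the tag and the user pulls the walk towards that user's items.
print(rank_items(random_walk(graph, ["t:java", "u:alice"], steps=3)))
```

Only the choice of start nodes changes between tasks here; the number of steps controls how far the walk spreads from them.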
![Page 34: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/34.jpg)
Tagging Relationships
![Page 35: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/35.jpg)
![Page 36: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/36.jpg)
An item recommendation walk
![Page 37: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/37.jpg)
Ratings
Ratings may enhance the graph, or just be used for evaluation
![Page 38: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/38.jpg)
Personalized Search
Assume a user who types a single tag as query
![Page 39: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/39.jpg)
Personalized Search
![Page 40: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/40.jpg)
A soft clustering effect smoothly relates similar concepts before converging to the background probability
![Page 41: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/41.jpg)
Homographs like “Java” are disambiguated because the walk starts in both the query tag and the target user
So, content that matches the user’s preference is more likely to be found first
![Page 42: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/42.jpg)
Common System Designs
![Page 43: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/43.jpg)
Analysis results
Allowing all users to tag all available content improves retrieval tasks
Combining tags and ratings may improve both search and recommendation tasks
![Page 44: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/44.jpg)
![Page 45: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/45.jpg)
Ternary relation lost!
The UIT matrix represents a ternary relation that is lost when creating the three UI, IT and UT matrices
Potentially a problem if tags express an opinion about an item; e.g.,
  “poetry” can still describe the user, independent of the item
  “awful” requires knowing which item the term belongs to
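
The loss can be made concrete with a small sketch. The users, items and tag assignments below are invented for illustration: two different sets of (user, item, tag) triples produce identical UI, IT and UT projections, so the pairwise matrices cannot tell which book was called “awful” by whom.

```python
from collections import Counter

def project(triples):
    """Collapse (user, item, tag) triples into the three pairwise count tables."""
    ui = Counter((u, i) for u, i, t in triples)
    it = Counter((i, t) for u, i, t in triples)
    ut = Counter((u, t) for u, i, t in triples)
    return ui, it, ut

# Hypothetical data: the two worlds swap who called which book "awful".
world_a = [
    ("ann", "book_a", "awful"), ("ann", "book_b", "poetry"),
    ("bob", "book_a", "poetry"), ("bob", "book_b", "awful"),
]
world_b = [
    ("ann", "book_a", "poetry"), ("ann", "book_b", "awful"),
    ("bob", "book_a", "awful"), ("bob", "book_b", "poetry"),
]

print(set(world_a) == set(world_b))           # the ternary relations differ
print(project(world_a) == project(world_b))   # the projections do not
```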
![Page 46: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/46.jpg)
![Page 47: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/47.jpg)
Tags vs. rating
Most tags do not deviate far from the mean rating
Only a few tags are strongly correlated with opinion
Note: poetry is higher quality than chicklit
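
One way to check this kind of claim on your own data is to compare each tag's mean rating with the overall mean. The sketch below assumes the data is available as (tag, rating) pairs, one per tagged and rated item; the function name and the toy observations are made up and do not reproduce the LibraryThing numbers.

```python
from collections import defaultdict
from statistics import mean

def tag_rating_deviation(observations):
    """Map each tag to (mean rating of items carrying the tag) - (overall mean).

    `observations` is a list of (tag, rating) pairs.
    """
    overall = mean(rating for _, rating in observations)
    by_tag = defaultdict(list)
    for tag, rating in observations:
        by_tag[tag].append(rating)
    return {tag: mean(ratings) - overall for tag, ratings in by_tag.items()}

# Toy observations, invented for illustration only.
observations = [
    ("fiction", 4), ("fiction", 3),
    ("classic", 4), ("classic", 3),
    ("poetry", 5), ("poetry", 4),
    ("awful", 1), ("awful", 2),
]

for tag, dev in sorted(tag_rating_deviation(observations).items(), key=lambda kv: kv[1]):
    print(f"{tag:8s} {dev:+.2f}")
```

Tags whose deviation stays near zero carry little opinion; a tag far from zero behaves more like a rating than like a topic label.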
![Page 48: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/48.jpg)
Metadata
Scientific articles have many types of metadata associated:
  Abstract
  Author
  Booktitle
  Description
  Journal
  Tags
Are all these types of metadata useful for item recommendation?
![Page 49: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/49.jpg)
Metadata
According to Toine Bogers’ PhD thesis:
“Profile-centric Matching”: concatenate all metadata fields of the items in a single user’s profile into one large text field, and use an off-the-shelf IR model to match that profile against the metadata of the items.
“Item-based Hybrid Filtering”: alternatively, construct item profiles from the metadata that all users assigned to an item, and apply an item-based collaborative filtering approach.
Author, description, tags, title, URL, journal and booktitle all contribute
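
A rough sketch of the profile-centric idea, under loud assumptions: the field names, the toy items and the function names are hypothetical, and a plain TF-IDF cosine stands in for the “off-the-shelf IR model” (the thesis may well use a different retrieval model).

```python
import math
from collections import Counter

FIELDS = ("title", "author", "tags", "journal")  # assumed metadata fields

def item_tokens(item):
    """Concatenate the metadata fields of one item into a token list."""
    return " ".join(item.get(f, "") for f in FIELDS).lower().split()

def tfidf(tokens, idf):
    tf = Counter(tokens)
    return {t: c * idf.get(t, 0.0) for t, c in tf.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(profile_items, candidates):
    """Rank candidate items against one big 'document' built from the profile."""
    docs = {name: item_tokens(item) for name, item in candidates.items()}
    df = Counter(t for toks in docs.values() for t in set(toks))
    idf = {t: math.log(1 + len(docs) / d) for t, d in df.items()}
    profile = [t for item in profile_items for t in item_tokens(item)]
    query = tfidf(profile, idf)
    scores = {name: cosine(query, tfidf(toks, idf)) for name, toks in docs.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical user profile and candidate items.
profile_items = [
    {"title": "random walks on tag graphs", "tags": "tagging recommendation"},
    {"title": "personalized search with tags", "tags": "search tagging"},
]
candidates = {
    "paper_x": {"title": "tag recommendation with graphs", "tags": "tagging graphs"},
    "paper_y": {"title": "protein folding dynamics", "tags": "biology"},
}
print(recommend(profile_items, candidates))
```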
![Page 50: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/50.jpg)
Finally: a recent case study
![Page 51: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/51.jpg)
Artist Popularity?
Let’s ask widely used social media music platforms!
I.e., query their APIs
![Page 52: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/52.jpg)
![Page 53: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/53.jpg)
Artist Popularity (1-3)
Top-5 popular artists in dataset
Jan 21 – Mar 21
3-hourly timestamped popularity indices
![Page 54: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/54.jpg)
http://bit.ly/ESSIR13IRSocMedia
![Page 55: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/55.jpg)
Artist Popularity
![Page 56: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/56.jpg)
Artist Popularity (?!)
Top-5 popular artists in dataset
Jan 21 – Mar 21
3-hourly timestamped popularity indices
![Page 57: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/57.jpg)
The Black Keys
![Page 58: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/58.jpg)
The Black Keys
Three Grammy Awards received!
![Page 59: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/59.jpg)
The Black Keys
The Web responds, while the service-based popularity index is static
![Page 60: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/60.jpg)
![Page 61: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/61.jpg)
Implications
An “artist popularity” index depends on the platform and its user population
Web-based popularity, estimated via a URL shortener’s API, “reacts” to real-world events
  Suitable as a search log replacement for academics?
Q: Which popularity is most useful: one that changes dynamically or one that lasts?
![Page 62: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/62.jpg)
![Page 63: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/63.jpg)
Many topics I skipped…
![Page 64: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/64.jpg)
![Page 65: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/65.jpg)
Tweets about blip.tv
“Twanchor text”
E.g.: http://blip.tv/file/2168377
  Amazing
  Watching “World’s most realistic 3D city models?”
  Google Earth/Maps killer
  Ludvig Emgard shows how maps/satellite pics on web is done (learn Google and MS!)
  and ~120 more Tweets
![Page 66: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/66.jpg)
Wikipedia
Wikipedia contains semantically very rich annotations:
  Wikipedia Categories
  Wikipedia Lists
  Times (1930, 1931, 1932, etc.)
  Names
  Disambiguation pages
  Etc.
Note: DBpedia is just Wikipedia
![Page 67: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/67.jpg)
Wikipedia
People have used Wikipedia edit history to look for events
![Page 68: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/68.jpg)
Geotags / POIs
Many social media items carry explicit geo information
  Geotags are low-level “coordinates”
  POIs are high-level “point-of-interest” labels
Applications
  Recommend geo-locations to people
  Predict POI tags from (tweet) text
  Predict where a user will go next
![Page 69: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/69.jpg)
Map text to locations
Build a language model from all tags assigned to Flickr images that belong to a predefined grid cell
Neighbouring cells are used for smoothing (like the hierarchical language models used previously for video / scene / shot)
Use the user frequency of a term in a location (instead of its term frequency)
Neil O’Hare and Vanessa Murdock
Modeling Locations with Social Media
Information Retrieval, February 2013, Volume 16, Issue 1, pp 30-62
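
The three ingredients on this slide (grid cells, neighbour smoothing, user frequency) can be sketched as follows. Everything specific here is an assumption made for illustration and not taken from the cited paper: the one-degree cell size, the linear interpolation with weight `lam`, the `eps` floor, the function names and the toy photos.

```python
import math
from collections import defaultdict

CELL_DEG = 1.0  # assumed grid resolution, in degrees

def cell_of(lat, lon, size=CELL_DEG):
    return (math.floor(lat / size), math.floor(lon / size))

def user_frequencies(photos):
    """uf[cell][term] = number of distinct users who used `term` in `cell`."""
    users = defaultdict(set)
    for user, lat, lon, tags in photos:
        cell = cell_of(lat, lon)
        for term in tags:
            users[(cell, term)].add(user)
    uf = defaultdict(dict)
    for (cell, term), who in users.items():
        uf[cell][term] = len(who)
    return uf

def neighbours(cell):
    r, c = cell
    return [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def term_score(uf, cell, term, lam=0.2, eps=1e-6):
    """Interpolate the cell's own estimate with the pooled estimate of its
    eight neighbours; `eps` is a small floor that avoids log(0)."""
    own = uf.get(cell, {})
    own_total = sum(own.values())
    p_own = own.get(term, 0) / own_total if own_total else 0.0
    pooled = pooled_total = 0
    for nb in neighbours(cell):
        counts = uf.get(nb, {})
        pooled += counts.get(term, 0)
        pooled_total += sum(counts.values())
    p_nb = pooled / pooled_total if pooled_total else 0.0
    return (1 - lam) * p_own + lam * p_nb + eps

def best_cell(uf, tags):
    """Cell whose smoothed model gives the tag set the highest log-score."""
    return max(sorted(uf), key=lambda cell: sum(math.log(term_score(uf, cell, t)) for t in tags))

# Toy photos (user, lat, lon, tags), invented for illustration.
photos = [
    ("u1", 37.98, 23.73, ["athens", "acropolis"]),
    ("u1", 37.98, 23.73, ["athens", "acropolis"]),  # same user: counted once
    ("u2", 37.97, 23.72, ["athens", "parthenon"]),
    ("u3", 39.33, -82.10, ["athens", "ohio", "university"]),
]
uf = user_frequencies(photos)
print(best_cell(uf, ["athens", "acropolis"]))
print(best_cell(uf, ["athens", "ohio"]))
```

Counting users rather than raw term occurrences keeps one prolific uploader from dominating a cell's model.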
![Page 70: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/70.jpg)
Placing Images: Easy
http://www.flickr.com/photos/63666148@N00/3615989115/
Athens, Ohio or Athens, Greece?
![Page 71: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/71.jpg)
Placing Images: Hard
Ballooning company in Ottawa
![Page 72: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/72.jpg)
Searching the Social Graph
Search entities, and the relationships between them, in the (facebook) social graph
Clearly IR problems, but who has the data to work with?
Michael Curtiss et al.
Unicorn: A System for Searching the Social Graph
PVLDB, Vol. 6, No. 11
![Page 73: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/73.jpg)
Crawling
How to get “the” data?
  Rate-limited APIs
  Terms of Service (ToS)
HEADACHES!
![Page 74: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/74.jpg)
Fred Morstatter, Jürgen Pfeffer, Huan Liu and Kathleen M. Carley
Is the Sample Good Enough? Comparing Data from Twitter’s Streaming API with Twitter’s Firehose
ICWSM 2013
![Page 75: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/75.jpg)
Not IR yet, but… Interesting stuff nevertheless!
de Volkskrant, March 13, 2013
Michal Kosinski, David Stillwell, and Thore Graepel
Private traits and attributes are predictable from digital records of human behavior
PNAS 2013; published ahead of print March 11, 2013, doi:10.1073/pnas.1218772110
![Page 76: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/76.jpg)
![Page 77: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/77.jpg)
![Page 78: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/78.jpg)
![Page 79: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/79.jpg)
Take home message(s)
Social media give us IR researchers access to a rich resource of context
  Including time & location!
Gather the right data for your problem domain, and it may be a good alternative for not having the click data we all want so badly
Various recommendation and retrieval tasks exist in social media. Can one theory address all of these?
![Page 80: ESSIR 2013 - IR and Social Media](https://reader037.fdocuments.net/reader037/viewer/2022110115/5492540cb47959f2518b458e/html5/thumbnails/80.jpg)
C U @ #ECIR2014 ? !