MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
![Page 1: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/1.jpg)
MediaFinder: Collect, Enrich and Visualize Media Memes
Shared by the Crowd
Raphaël Troncy
@rtroncy
![Page 2: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/2.jpg)
Conferences and natural disaster
14/05/2013 Real-Time Analysis and Mining of Social Streams (RAMSS) - Rio de Janeiro - 2
![Page 3: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/3.jpg)
![Page 4: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/4.jpg)
![Page 5: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/5.jpg)
![Page 6: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/6.jpg)
![Page 7: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/7.jpg)
Social Media: some definitions
Media Item: a photo or a video that is shared on a social network
Micropost: a text status message that can optionally accompany a media item
Social Network: an online service that focuses on building and reflecting social relationships among people sharing interests or activities
Media Sharing Platforms: emphasis on sharing media, but with blurred boundaries with social networks, since users are encouraged to react to media content (like, comment, favorite, etc.)
![Page 8: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/8.jpg)
Social networks and media items
First-order support: Posting requires the inclusion of a media item
Example: Flickr, YouTube
Second-order support: Possibility to post media items but also text-only messages
Example: Facebook
Third-order support: No direct support for media items; relies on third-party applications to host them
Example: Twitter before the introduction of native photo support
![Page 9: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/9.jpg)
Media Server
Composition of media item extractors (12 SNs)
Relies on search APIs + a fixed 30s timeout window to provide results
Falls back on screen scraping when necessary (Twitter ecosystem)
Implemented as a NodeJS server
Serialize results in a common schema (JSON)
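The fan-out behaviour described on this slide can be sketched as follows. This is only an illustration in Python (the actual Media Server is a NodeJS application): the extractor functions, the network names and the field names of the common schema (`mediaUrl`, `micropost`, `source`) are hypothetical, not taken from the Media Server code.

```python
import asyncio


async def fast_extractor(term):
    """Hypothetical extractor standing in for one social network's search API."""
    await asyncio.sleep(0.01)
    # Assumed common schema: every extractor returns dicts with the same keys.
    return [{"mediaUrl": "http://example.org/" + term + ".jpg",
             "micropost": "photo about " + term,
             "source": "fast-network"}]


async def slow_extractor(term):
    """Hypothetical extractor that answers too late to be included."""
    await asyncio.sleep(5)
    return [{"mediaUrl": "http://example.org/late.jpg",
             "micropost": "late result",
             "source": "slow-network"}]


async def search_all(term, extractors, timeout=30.0):
    """Query all extractors in parallel and keep what arrived within the window."""
    tasks = {name: asyncio.create_task(fn(term)) for name, fn in extractors.items()}
    done, pending = await asyncio.wait(tasks.values(), timeout=timeout)
    for task in pending:
        task.cancel()  # give up on networks that missed the window
    results = {}
    for name, task in tasks.items():
        if task in done and task.exception() is None:
            results[name] = task.result()
        else:
            results[name] = []  # timed out or failed: empty list for this network
    return results


if __name__ == "__main__":
    extractors = {"fast": fast_extractor, "slow": slow_extractor}
    print(asyncio.run(search_all("elezioni2013", extractors, timeout=0.2)))
```

Returning an empty list for a late or failing network keeps the aggregated answer usable even when one service is unavailable.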
![Page 10: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/10.jpg)
Deep link Permalink
Clean text for NLP processing
Aggregate view of ALL social interactions
12 Social Networks
![Page 11: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/11.jpg)
Media Finder (www2013)
![Page 12: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/12.jpg)
Media Finder (zooming on media items)
![Page 13: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/13.jpg)
Media Finder (timeline view)
![Page 14: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/14.jpg)
Named Entities are Pivotal
Standalone software: GATE, Stanford CoreNLP, Temis
Web APIs
http://nerd.eurecom.fr/
![Page 15: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/15.jpg)
What is NERD?
Ontology [1], REST API [2], UI [3]
[1] http://nerd.eurecom.fr/ontology
[2] http://nerd.eurecom.fr/api/application.wadl
[3] http://nerd.eurecom.fr
The NERD ontology has been integrated in the NIF project, an EU FP7 effort in the context of LOD2: Creating Knowledge out of Interlinked Data
![Page 16: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/16.jpg)
NERD REST API
GET, POST, PUT, DELETE
/document /user /annotation/{extractor} /extraction /evaluation ...
JSON/RDF*
"entities": [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]
Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
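A minimal sketch of how a client might consume an entity list shaped like the one on this slide. It does not call the NERD API: the response is the slide's sample wrapped in a JSON object, the input sentence is invented so that the offsets line up, and treating `endChar` as exclusive is an assumption (it matches the sample, where 45 - 30 equals the length of the entity string).

```python
import json

# Invented sentence in which "Tim Berners-Lee" starts at character 30.
SAMPLE_TEXT = "A talk on the Semantic Web by Tim Berners-Lee"

# The slide's sample entity, wrapped in an object so it is valid JSON.
SAMPLE_RESPONSE = """
{"entities": [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]}
"""


def confident_entities(response_json, min_confidence=0.5):
    """Parse the response and keep entities at or above a confidence threshold."""
    payload = json.loads(response_json)
    return [e for e in payload["entities"] if e["confidence"] >= min_confidence]


def surface_form(text, entity):
    """Recover the mention from the text (assumes endChar is exclusive)."""
    return text[entity["startChar"]:entity["endChar"]]


if __name__ == "__main__":
    for entity in confident_entities(SAMPLE_RESPONSE):
        print(surface_form(SAMPLE_TEXT, entity), entity["nerdType"], entity["uri"])
```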
![Page 17: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/17.jpg)
Media Finder Architecture
Media items harvesting using the Media Server
http://eventmedia.eurecom.fr/media-server/search/{combined}/{term}
https://github.com/vuknje/media-server (@tomayac fork)
Image near de-duplication
DCT signature on image and video frame, Hamming distance between image pairs
Clustering and disambiguation
Named Entity Extraction using NERD
Topic Generation using LDA
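The near de-duplication step can be illustrated with a small sketch. The slides only state that a DCT signature is computed and that image pairs are compared with a Hamming distance. The concrete scheme below is an assumption, not necessarily what Media Finder does: it keeps the low-frequency 8x8 block of a 2D DCT, thresholds it at the median, and uses a distance cut-off of 10 bits. The input is a grayscale image given as a list of rows.

```python
import math
import random
from statistics import median


def dct_1d(values):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(gray):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in gray]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def dct_signature(gray, block=8):
    """63-bit signature from the low-frequency coefficients (assumed scheme)."""
    coeffs = dct_2d(gray)
    # Top-left block in row-major order, skipping the DC term at [0][0].
    low = [coeffs[u][v] for u in range(block) for v in range(block)][1:]
    threshold = median(low)
    bits = 0
    for c in low:
        bits = (bits << 1) | (1 if c > threshold else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(sig_a, sig_b, max_distance=10):
    """Treat two images as near-duplicates below an assumed distance cut-off."""
    return hamming(sig_a, sig_b) <= max_distance


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
    brighter = [[min(255, p + 10) for p in row] for row in image]
    print(hamming(dct_signature(image), dct_signature(brighter)))
```

For video, the same signature would be computed on sampled frames. In practice an image would first be resized to a small fixed size and converted to grayscale before calling `dct_signature`.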
![Page 18: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/18.jpg)
Media Finder (named entities clustering)
![Page 19: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/19.jpg)
Media Finder (zooming in a cluster)
![Page 20: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/20.jpg)
Media Finder
Live Topic Generation from Event Streams
Meet us at WWW 2013 Demo Session
http://www.youtube.com/watch?v=8iRiwz7cDYY
![Page 21: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/21.jpg)
Tracking an event: Italian Election
Repeated queries over a period of time
We have tracked and analyzed media posts tagged as elezioni2013 from 2013-02-26 to 2013-03-03
Cron job: every 30 minutes over the 6 days
Slice the data into 24-hour slots
Research question: Can we re-create the news headlines?
Storyboarding: http://mediafinder.eurecom.fr/story/elezioni2013
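Slicing the harvested posts into 24-hour slots can be sketched as below. The post structure (a dict with a `published` datetime) and the function name are hypothetical. Only the slot length and the tracking period come from the slide.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def slice_into_slots(posts, start, slot=timedelta(hours=24)):
    """Group posts by the index of the slot their timestamp falls into."""
    slots = defaultdict(list)
    for post in posts:
        # timedelta // timedelta yields an integer slot index (0 = first day).
        index = (post["published"] - start) // slot
        slots[index].append(post)
    return dict(slots)


if __name__ == "__main__":
    start = datetime(2013, 2, 26)
    posts = [
        {"id": "a", "published": datetime(2013, 2, 26, 0, 10)},
        {"id": "b", "published": datetime(2013, 2, 27, 0, 20)},
        {"id": "c", "published": datetime(2013, 3, 3, 12, 0)},
    ]
    for index, items in sorted(slice_into_slots(posts, start).items()):
        print(index, [p["id"] for p in items])
```

With the period on the slide, the six days from 2013-02-26 to 2013-03-03 map to slot indices 0 through 5.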
![Page 22: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/22.jpg)
Tracking an event: Italian Election
Dataset:
~16501 microposts containing (duplicate) media items
~21087 Named Entities extracted
Clustering: NER and LDA
Generate Bag of Entities (BOE) disambiguated with a DBpedia URI
Examples: Monti, Bersani, Italia, Berlusconi, Grillo, Stelle
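One way to picture the Bag of Entities is as a count of microposts per disambiguated DBpedia URI, so that different surface forms of the same entity collapse into one key. The sketch below is an assumption about the data layout (each micropost carries an `entities` list with a `uri` field, as in the NERD output). The URIs and texts are illustrative, not taken from the dataset.

```python
from collections import Counter


def bag_of_entities(microposts):
    """Count, per DBpedia URI, the number of microposts mentioning that entity."""
    bag = Counter()
    for post in microposts:
        # A set ensures an entity is counted once per micropost,
        # even when it is mentioned under several surface forms.
        uris = {e["uri"] for e in post["entities"] if e.get("uri")}
        bag.update(uris)
    return bag


if __name__ == "__main__":
    posts = [
        {"text": "Grillo e Monti in piazza", "entities": [
            {"entity": "Grillo", "uri": "http://dbpedia.org/resource/Beppe_Grillo"},
            {"entity": "Monti", "uri": "http://dbpedia.org/resource/Mario_Monti"},
        ]},
        {"text": "Beppe Grillo, ancora Grillo", "entities": [
            {"entity": "Beppe Grillo", "uri": "http://dbpedia.org/resource/Beppe_Grillo"},
            {"entity": "Grillo", "uri": "http://dbpedia.org/resource/Beppe_Grillo"},
        ]},
    ]
    for uri, count in bag_of_entities(posts).most_common():
        print(count, uri)
```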
![Page 23: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/23.jpg)
Tracking an event: Italian Election
Tracking and Analyzing The 2013 Italian Election
To appear at ESWC 2013 Demo Session
http://www.youtube.com/watch?v=jIMdnwMoWnk
![Page 24: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/24.jpg)
Take Home Message
Media Server / Media Finder:
Aggregating fresh social media items
Making sense of media collection for video hyper-linking
NERD platform for extracting key information
Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling
Sneak preview: Interact with a Kinect and discover enriched hypervideo
http://www.youtube.com/watch?v=4mSC685AG7k
![Page 25: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/25.jpg)
Credits
Vuk Milicic … interaction designer
Giuseppe Rizzo … NERD guru
José Luis Redondo Garcia … triplification and clustering
Thomas Steiner … Media Server original code
![Page 26: MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd](https://reader033.fdocuments.net/reader033/viewer/2022051323/549843beb47959514d8b546a/html5/thumbnails/26.jpg)
http://www.slideshare.net/troncy