Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams
![Page 1: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/1.jpg)
Delft University of Technology
Semantics + Filtering + Search = Twitcident
Exploring Information in Social Web Streams
Hypertext 2012, Milwaukee, WI – June 28
Fabian Abel, Claudia Hauff, Geert-Jan Houben, Richard Stronkman, Ke Tao
Web Information Systems, TU Delft, the Netherlands
![Page 2: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/2.jpg)
200,000,000: number of tweets published per day
![Page 3: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/3.jpg)
Pukkelpop 2011
People tweet about everything,
everywhere :-)
![Page 4: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/4.jpg)
Pukkelpop 2011 became a tragedy
81,000 tweets in four hours
Filtering
200,000,000
Search & Analytics
Useful tweets?
![Page 5: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/5.jpg)
Case Nijmegen: Train accident
![Page 6: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/6.jpg)
First tweet…
And then your train blasts off full of the anvils. #Nijmegen #veolia
![Page 7: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/7.jpg)
First picture…
Astonishing! My train rams the platform at Nijmegen!
http://pic.twitter.com/QVVfJHyd
![Page 8: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/8.jpg)
Traditional news media
A train rammed the anvils at Nijmegen.
![Page 9: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/9.jpg)
Research Challenges
1. (Automatic) Filtering: Given an incident, how can one automatically identify those tweets that are relevant to the incident?
2. Search & Analytics: How can one improve search and analytical capabilities so that users can explore information in the streams of tweets?
Diagram: Twitter streams; Filtering (topic); Search & Analytics (information need)
![Page 10: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/10.jpg)
Twitcident Pipeline
Automatic Filtering
Search & Analytics
![Page 11: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/11.jpg)
Twitcident system
![Page 12: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/12.jpg)
Twitcident Pipeline
Automatic Filtering
Search & Analytics
![Page 13: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/13.jpg)
Incident detection
• Twitcident relies on Emergency Broadcasting Services for detecting incidents.
• In the Netherlands: P2000 communication network
![Page 14: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/14.jpg)
Incident Profiling
• For an incident i:
• The profile of an incident is described as a set of tuples.
• Each tuple includes a facet-value pair (f, v) and its weight to the incident i.
Example tuples (facet, value): weight
• (Location, Netherlands): 0.4
• (Incident, Train accident): 0.5
• (Location, Nijmegen): 0.8
• (Organization, Veolia): 0.6
• (Incident, Crash): 1.0
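A profile of this kind can be sketched as a small data structure. The snippet below is only an illustration, assuming a Python dictionary keyed by (facet, value) pairs; the names `IncidentProfile`, `nijmegen_profile`, and `top_pairs` are hypothetical and do not come from the Twitcident system. The weights are the ones shown on the slide.

```python
from typing import Dict, List, Tuple

# A (facet, value) pair such as ("Location", "Nijmegen").
FacetValue = Tuple[str, str]
# Hypothetical representation: each pair maps to its weight for the incident.
IncidentProfile = Dict[FacetValue, float]

# Example profile with the weights from the slide (Nijmegen train accident).
nijmegen_profile: IncidentProfile = {
    ("Location", "Netherlands"): 0.4,
    ("Incident", "Train accident"): 0.5,
    ("Location", "Nijmegen"): 0.8,
    ("Organization", "Veolia"): 0.6,
    ("Incident", "Crash"): 1.0,
}


def top_pairs(profile: IncidentProfile, k: int) -> List[Tuple[FacetValue, float]]:
    """Return the k facet-value pairs with the highest weight (illustrative helper)."""
    return sorted(profile.items(), key=lambda item: item[1], reverse=True)[:k]
```

Calling `top_pairs(nijmegen_profile, 2)` would return the Crash and Nijmegen pairs, the two facet-values most strongly tied to this incident.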
![Page 15: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/15.jpg)
Twitcident Pipeline
Automatic Filtering
Search & Analytics
![Page 16: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/16.jpg)
Social Media Aggregation
• Collecting Twitter messages, pictures, and videos from Social Media Platforms, e.g. Twitter, PhotoBucket, Vimeo
![Page 17: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/17.jpg)
Twitcident Pipeline
Automatic Filtering
Search & Analytics
![Page 18: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/18.jpg)
Semantic Enrichment
• Named Entity Recognition
• Classification: Casualties, Damages, Risks…
• Linkage: External Resources
• Metadata extraction
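As a rough illustration of the metadata extraction step only, the sketch below pulls hashtags, mentions, and URLs out of a tweet's text with regular expressions. It is a naive stand-in written for this transcript, not the enrichment code used by Twitcident; the function name `extract_metadata` and the returned field names are assumptions.

```python
import re
from typing import Dict, List

# Deliberately simple patterns; real tweets need more careful tokenization.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def extract_metadata(text: str) -> Dict[str, List[str]]:
    """Collect hashtags, user mentions, and URLs from a tweet text (illustrative)."""
    return {
        "hashtags": HASHTAG.findall(text),
        "mentions": MENTION.findall(text),
        "urls": URL.findall(text),
    }
```

Applied to the first tweet quoted earlier in the talk, this would pick up the hashtags `Nijmegen` and `veolia`, which are the kind of signals that named entity recognition and classification can then build on.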
![Page 19: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/19.jpg)
Twitcident Pipeline
Automatic Filtering
Search & Analytics
![Page 20: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/20.jpg)
Filtering
• Which tweets are relevant to the incidents?
• Preprocessing: Language detection
• Semantic Filtering: Compare tweet with P(i)
• Semantic Filtering with News Context
• P’(i): P(i) complemented with f-v pairs from news
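The two semantic filtering variants can be sketched as follows. This is an illustration under several assumptions, not the published method: P(i) is taken to be the incident profile (facet-value pairs with weights), a tweet is represented by the set of facet-value pairs found in it, the comparison is a simple weighted overlap, and the threshold value is arbitrary. All function names are hypothetical.

```python
from typing import Dict, Set, Tuple

FacetValue = Tuple[str, str]
Profile = Dict[FacetValue, float]


def relevance(tweet_pairs: Set[FacetValue], profile: Profile) -> float:
    """Sum the weights of the profile pairs that also occur in the tweet.

    ASSUMPTION: weighted overlap; the slide only says "compare tweet with P(i)".
    """
    return sum(weight for pair, weight in profile.items() if pair in tweet_pairs)


def extend_with_news(profile: Profile, news_pairs: Profile) -> Profile:
    """Build P'(i): P(i) complemented with facet-value pairs from news.

    ASSUMPTION: weights already in P(i) are kept; only new pairs are added.
    """
    extended = dict(news_pairs)
    extended.update(profile)
    return extended


def is_relevant(tweet_pairs: Set[FacetValue], profile: Profile,
                threshold: float = 1.0) -> bool:
    """Keep a tweet when its relevance reaches the (arbitrary) threshold."""
    return relevance(tweet_pairs, profile) >= threshold
```

In this sketch, a tweet that mentions only Nijmegen and Veolia may fall below the threshold against P(i) alone, yet pass once the news context has added a Veolia pair to P’(i).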
![Page 21: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/21.jpg)
Twitcident Pipeline
Automatic Filtering
Search & Analytics
![Page 22: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/22.jpg)
Faceted Search
• Strategies (ranking)
• Frequency-based
• Time-sensitive
• Personalized
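To make the first two ranking strategies concrete, here is a small sketch of how facet-values might be ordered for display. It assumes that each filtered tweet contributes (facet-value, timestamp) observations, and it models time sensitivity with an exponential decay; the decay, the `half_life` parameter, and the function names are illustrative choices, not details taken from the talk. The personalized strategy is not sketched because the slides give no detail on it.

```python
from collections import Counter
from typing import Dict, List, Tuple

FacetValue = Tuple[str, str]
# One observation: a facet-value pair and the tweet's timestamp in seconds.
Observation = Tuple[FacetValue, float]


def rank_by_frequency(observations: List[Observation]) -> List[FacetValue]:
    """Order facet-values by how often they occur in the filtered tweets."""
    counts = Counter(pair for pair, _ in observations)
    return [pair for pair, _ in counts.most_common()]


def rank_time_sensitive(observations: List[Observation], now: float,
                        half_life: float = 3600.0) -> List[FacetValue]:
    """Order facet-values so that recent occurrences count more.

    ASSUMPTION: each occurrence is weighted by 0.5 ** (age / half_life).
    """
    scores: Dict[FacetValue, float] = {}
    for pair, timestamp in observations:
        age = max(0.0, now - timestamp)
        scores[pair] = scores.get(pair, 0.0) + 0.5 ** (age / half_life)
    return sorted(scores, key=lambda pair: scores[pair], reverse=True)
```

With these definitions, a facet-value mentioned twice two hours ago ranks first by frequency, while one mentioned once just now ranks first under the time-sensitive ordering.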
![Page 23: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/23.jpg)
Real-time analytics
What type of things are mentioned in the tweets?
What aspects are mentioned over time?
What do people report about over time?
Impact Area
![Page 24: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/24.jpg)
Evaluation - Dataset
• Twitter corpus (TREC Microblog Track 2011)
• 16 million tweets (Jan. 24th – Feb. 8th, 2011)
• 4,766,901 tweets classified as English
• 6.2 million entity-extractions
• News (same time period)
• 62 RSS News Feeds
• 13,959 News Articles
• 357,559 entity-extractions
![Page 25: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/25.jpg)
Evaluation for Tweet Filtering (1/2)
Semantic strategies outperform the keyword-based filtering regarding all metrics.
![Page 26: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/26.jpg)
Evaluation for Tweet Filtering (2/2)
The semantic strategy is more robust and achieves higher precision for complex topics.
![Page 27: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/27.jpg)
Evaluation for Faceted Search (1/2)
The semantic faceted search strategy improves the search performance by 34.8% and 22.4%.
![Page 28: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/28.jpg)
Evaluation for Faceted Search (2/2)
The strategies with semantic enrichment outperform the strategy without semantic enrichment in predicting the appropriate facet-values.
![Page 29: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/29.jpg)
Conclusions
• What we have done:
• Twitcident, a framework for filtering, searching, and analyzing information about incidents that people publish in their Social Web Streams
• What we have achieved:
• Better filtering of Twitter messages for a given incident.
• Better search for relevant information about an incident within the filtered messages.
![Page 30: Semantics + Filtering + Search = Twitcident - Exploring Information in Social Web Streams](https://reader035.fdocuments.net/reader035/viewer/2022062418/55504e64b4c9058f768b530a/html5/thumbnails/30.jpg)
Thank you!
Ke Tao @taubau
@wisdelft
http://twitcident.org