Coordinating Human and Machine Intelligence to Classify Microblog Communications in Crises
Muhammad Imran, Carlos Castillo, Ji Lucas, Patrick Meier, Jakob Rogstadius
Qatar Computing Research Institute (QCRI), Doha, Qatar
USEFUL INFORMATION ON TWITTER
[Figure: pie charts showing the percentage of informative tweets per category and, within each category, per subtype.]
• Caution and advice: a siren heard; tornado warning issued/lifted; tornado sighting/touchdown
• Information source: photos, webpages, or videos as information source
• Donations: money; equipment, shelter, volunteers, blood; other donations
• Casualties & damage: people injured; people dead; damage
Ref: "Extracting Information Nuggets from Disaster-Related Messages in Social Media". Imran et al. ISCRAM-2013, Baden-Baden, Germany.
SOCIAL MEDIA INFORMATION PROCESSING: OFFLINE APPROACH
1. Data collection
2. Human annotations on sample data
3. Machine training
4. Classification
Disaster Timeline:
[Figure: impact and response timeline with the data collection period marked, comparing disaster response today with the target disaster response. Source: Department of Community Safety, Queensland Govt. 2011 & UNOCHA]
Target disaster response requires real-time processing.
REAL-TIME SOCIAL MEDIA ANALYSIS
Key requirements:
• Real-time data collection
  • Able to incorporate new data collection strategies
• Obtain human labels in real time
  • Perform de-duplication
• Perform almost-online machine learning
  • Continuous learning
  • Learn as new labels arrive
• Perform real-time classification
  • Scale with big disasters (Sandy: 15k posts/min)
SOCIAL MEDIA INFORMATION PROCESSING: ONLINE APPROACH (REAL-TIME)
1. Data collection
2. Human annotations
3. Machine training
4. Classification
[Figure: ONLINE APPROACH. Timeline with DATA COLLECTION and CLASSIFICATION, human annotation (HA) rounds 1, 2, 3, …, n, and learning rounds 1, 2, 3, …, n; the first few hours are marked.]
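The online approach, where the model keeps learning as new human labels arrive, can be illustrated with a minimal sketch. The slides do not say which learning algorithm is used; the perceptron below, the `OnlinePerceptron` class, and the example tweets are all assumptions chosen only to show the update-per-label pattern.

```python
class OnlinePerceptron:
    """Tiny binary text classifier updated one labeled tweet at a time.

    Hypothetical stand-in for the learner; not the algorithm from the slides.
    """

    def __init__(self):
        self.weights = {}  # token -> weight
        self.bias = 0.0

    @staticmethod
    def features(text):
        # Bag-of-words features: the set of lowercase tokens.
        return set(text.lower().split())

    def score(self, text):
        return self.bias + sum(self.weights.get(t, 0.0) for t in self.features(text))

    def predict(self, text):
        return 1 if self.score(text) > 0 else 0

    def learn(self, text, label):
        """Perceptron update: weights change only when the prediction is wrong."""
        error = label - self.predict(text)  # -1, 0, or +1
        if error:
            for tok in self.features(text):
                self.weights[tok] = self.weights.get(tok, 0.0) + error
            self.bias += error


# Labels arrive gradually (label 1 = informative, 0 = not informative).
model = OnlinePerceptron()
labeled_stream = [
    ("tornado warning issued take shelter now", 1),
    ("great pizza tonight", 0),
    ("siren heard tornado touchdown reported", 1),
    ("watching a movie tonight", 0),
]
for text, label in labeled_stream:
    model.learn(text, label)  # the model is usable after every single update
```

Because each update touches only the tokens of one tweet, classification can continue between updates, which is the property the online approach relies on.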
http://aidr.qcri.org/
AIDR (Artificial Intelligence for Disaster Response) is a free, open-source, and easy-to-use platform to automatically filter and classify relevant tweets posted during humanitarian crises.
1. Collect
2. Curate
3. Classify
AIDR: FROM END-USERS' PERSPECTIVE
Two-step approach:
1. Collection: a collection is a set of filters
  • Keywords, hashtags
  • Geographical bounding box
  • Languages
  • Follow a specific set of users
2. Classifier(s): a classifier is a set of tags
  • Donation requests & offers
  • Damage & casualties
  • Eyewitness accounts
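The two concepts above can be pictured as simple data structures. This is an illustrative sketch, not AIDR's actual data model: the `Collection` and `Classifier` classes, the tweet dictionary fields (`text`, `lang`, `user`, `coords`), and the rule for combining filters (language must match, then any one of the other filters suffices) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Collection:
    """A collection is a set of filters (hypothetical structure)."""
    keywords: set = field(default_factory=set)
    bounding_box: Optional[tuple] = None  # (min_lon, min_lat, max_lon, max_lat)
    languages: set = field(default_factory=set)
    followed_users: set = field(default_factory=set)

    def matches(self, tweet):
        # Assumed rule: the language filter must pass, then any other filter may match.
        if self.languages and tweet["lang"] not in self.languages:
            return False
        text = tweet["text"].lower()
        if any(k.lower() in text for k in self.keywords):
            return True
        if tweet["user"] in self.followed_users:
            return True
        if self.bounding_box and tweet.get("coords"):
            lon, lat = tweet["coords"]
            min_lon, min_lat, max_lon, max_lat = self.bounding_box
            return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
        return False


@dataclass
class Classifier:
    """A classifier is a set of tags (hypothetical structure)."""
    name: str
    tags: set


collection = Collection(keywords={"#joplin", "tornado"}, languages={"en"})
classifier = Classifier(
    name="needs",
    tags={"donation request", "donation offer", "damage", "casualties", "eyewitness"},
)

tweet = {"text": "Tornado sighting near downtown #Joplin", "lang": "en",
         "user": "alice", "coords": None}
print(collection.matches(tweet))
```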
REAL-TIME CLASSIFICATION IN AIDR
[Figure: architecture diagram. A collection feeds the classifiers (Classifier-1, Classifier-2), which attach tags to tweets; a learner takes labeling tasks and produces the model. The diagram is annotated with 30k/min.]
HUMAN ANNOTATION: CHALLENGES
• Crisis-specific labels are necessary
  • Contrasting vocabulary
  • Differences in public concerns, affected infrastructure
  • New labels should be collected for each new crisis
Crowdsourcing is a big research topic. We address two challenges here:
1. Labeling task selection
  • Which tasks to pick?
  • No duplicate tasks should be labeled
  • Prioritize tasks that are likely to increase accuracy
2. Labeling task scheduling
  • All-at-once labeling
  • Gradual labeling
  • Independent labeling
[ Imran et al. 2013b ]
DATASETS
1. Joplin-2011
  • Consists of 206,764 tweets collected using (#joplin)
2. Sandy-2012
  • Consists of 4,906,521 tweets collected using (#sandy, hurricane sandy, …)
3. Oklahoma-2013
  • Consists of 2,742,588 tweets collected using (Oklahoma, tornado, …)
DISASTER PHASES & # OF TWEETS
• Pre: corresponds to the preparedness phase
• Impact: corresponds to the period in which the main effects are felt
• Post: corresponds to the response and recovery phase
[Figure: number of tweets per day in all datasets. Joplin (left), Sandy (center), and Oklahoma (right).]
LABELING TASK SELECTION
Experiment: Are crisis-specific labels necessary?

Manual labeling (using Crowdflower)
Dataset    Phase-S1   Phase-S2   Phase-S3   Phase-S4
Joplin     2,000      1,000      1,000      1,000
Sandy      2,000      1,000      1,000      1,000
Oklahoma   2,000      1,000      1,000      N/A

Classification accuracy in various transfer scenarios
Train    Test       AUC
Joplin   Sandy      0.52
Joplin   Oklahoma   0.56
Sandy    Oklahoma   0.53
* AUC 0.5 represents a random classifier
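To make the AUC values above concrete, here is a small sketch of how AUC can be computed from classifier scores. It uses the standard pairwise definition (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as half); the function name `auc` and the toy inputs are illustrative and do not come from the experiments.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# A classifier that ranks every positive above every negative scores 1.0;
# one that gives every tweet the same score is indistinguishable from random.
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))
print(auc([1, 0], [0.5, 0.5]))
```

Read this way, the transfer results of 0.52 to 0.56 mean that a model trained on one crisis ranks tweets from another crisis only slightly better than chance.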
LABELING TASK SELECTION
Experiment: Is de-duplication necessary?

Train (phase, size)   Test (phase, size)   AUC (without de-duplication)   AUC (with de-duplication)
S1 (pre), 1,500       S1 (pre), 500        0.78                           0.74
S1 (pre), 500         S1 (pre), 500        0.73                           0.72
S2 (impact), 500      S2 (impact), 500     0.80                           0.72
S3 (post), 500        S3 (post), 500       0.79                           0.73
S4 (post’), 500       S4 (post’), 500      0.70                           0.64

• 29-74% of tweets are re-tweets & 60-75% are near duplicates
• Duplication causes an artificial increase in accuracy
• De-duplication is necessary to reduce classifier bias; otherwise the classifier learns fewer concepts
• De-duplication is necessary to improve the workers' experience
[ Rogstadius et al. 2011 ]
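One simple way to drop re-tweets and near duplicates before sending tasks to workers is to compare tweets after normalizing them. The slides do not describe the de-duplication method that was used; the normalization rules and the `normalize` and `deduplicate` helpers below are assumptions for illustration, and the character filter only handles ASCII text.

```python
import re


def normalize(text):
    """Reduce a tweet to a canonical form so trivial variants compare equal."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"\brt\b", " ", text)         # drop retweet markers
    text = re.sub(r"@\w+", " ", text)           # drop user mentions
    text = re.sub(r"[^a-z0-9#\s]", " ", text)   # drop punctuation (ASCII-only sketch)
    return " ".join(text.split())               # collapse whitespace


def deduplicate(tweets):
    """Keep the first tweet of every group that shares a normalized form."""
    seen = set()
    unique = []
    for tweet in tweets:
        key = normalize(tweet)
        if key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique


tweets = [
    "Tornado warning issued!",
    "RT @bob: tornado warning issued http://t.co/x",
    "Donate blood today",
]
print(deduplicate(tweets))
```

Exact matching on a normalized form only catches the easiest near duplicates; a similarity threshold over token sets would catch more, at higher cost.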
LABELING TASK SELECTION
Experiment: Which approach: passive vs. active learning?
[Figure: results for JOPLIN, SANDY, and OKLAHOMA across phases S1, S2, S3, and S4.]
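The difference between the two approaches can be sketched as two ways of picking the next tweets to label. Passive learning picks at random; active learning picks the items expected to help the classifier most. The slides do not state which active learning criterion was used, so the uncertainty sampling below (choosing items whose predicted probability is closest to 0.5), the function names, and the toy probabilities are assumptions.

```python
import random


def select_passive(pool, k, seed=0):
    """Passive learning: choose k unlabeled items uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(pool, k)


def select_active(pool, predict_proba, k):
    """Active learning by uncertainty sampling: choose the k items whose
    predicted probability of the positive class is closest to 0.5."""
    return sorted(pool, key=lambda item: abs(predict_proba(item) - 0.5))[:k]


# Hypothetical classifier confidences for four unlabeled tweets.
probs = {"a": 0.95, "b": 0.55, "c": 0.10, "d": 0.48}
pool = list(probs)

print(select_active(pool, probs.get, 2))  # the two most uncertain items
print(select_passive(pool, 2))            # two random items
```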
LABELING TASK SELECTION
• Are crisis-specific labels necessary? [YES]
• Is de-duplication necessary? [YES]
• Which approach to follow: passive vs. active learning? [Active learning]
Now we know WHICH tasks to select, but we still don't know WHEN to label them.
LABELING TASK SCHEDULING
• All-at-once labeling
  • Obtain 1,500 labels on S1 and use all for training
• Cumulative labeling
  • Obtain 500 labels in each of S1, S2, and S3 and train on labels available up to each phase
• Independent labeling
  • Obtain 500 labels in each of S1, S2, and S3 and use the most recent labels for training, discarding old ones
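The three scheduling strategies differ only in which labels are used for training at a given phase. The sketch below makes that explicit; the `training_set` function, the strategy names as strings, and the integer stand-ins for labeled tweets are illustrative assumptions, while the label counts follow the description above.

```python
PHASES = ["S1", "S2", "S3"]


def training_set(strategy, labels_by_phase, current_phase):
    """Return the labels used for training at current_phase under a strategy."""
    idx = PHASES.index(current_phase)
    if strategy == "all_at_once":
        # Everything was labeled in S1 and is reused in every later phase.
        return list(labels_by_phase["S1"])
    if strategy == "cumulative":
        # All labels obtained up to and including the current phase.
        return [x for p in PHASES[: idx + 1] for x in labels_by_phase.get(p, [])]
    if strategy == "independent":
        # Only the most recent labels; older ones are discarded.
        return list(labels_by_phase.get(current_phase, []))
    raise ValueError(f"unknown strategy: {strategy}")


# Integers stand in for labeled tweets.
all_at_once_labels = {"S1": list(range(1500))}
gradual_labels = {
    "S1": list(range(0, 500)),
    "S2": list(range(500, 1000)),
    "S3": list(range(1000, 1500)),
}

print(len(training_set("all_at_once", all_at_once_labels, "S3")))
print(len(training_set("cumulative", gradual_labels, "S2")))
print(len(training_set("independent", gradual_labels, "S3")))
```

All three strategies spend the same total budget of 1,500 labels; what changes is how recent the training data is when each phase is classified.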
LABELING TASK SCHEDULING
Experiment: Which labeling strategy to follow?
[Figure: results for JOPLIN, SANDY, and OKLAHOMA; panels labeled Informative, Informative (50%), and Donations.]
CONCLUSION & FUTURE WORK
• Task selection
  • De-duplication is necessary
  • Active learning approach must be employed
• Task scheduling
  • All-at-once for small-scale crises
  • Incremental for medium-scale crises (needs tests)
Future work:
• Adaptive collection
• Post-processing/filtering
• More features and learning schemes
http://aidr.qcri.org/
AIDR (Artificial Intelligence for Disaster Response) is a free, open-source, and easy-to-use platform to automatically filter and classify relevant tweets posted during humanitarian crises.
Thank you!