Introduction to Machine Learning: An Application to Disaster Response

Muhammad Imran & Shafiq Joty, Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar

Page 1: Introduction to Machine Learning: An Application to Disaster Response

Introduction to Machine Learning: An Application to Disaster Response

Muhammad Imran & Shafiq Joty

Qatar Computing Research Institute, Hamad Bin Khalifa University

Doha, Qatar

Page 2: Introduction to Machine Learning: An Application to Disaster Response

DISASTERS - SOCIAL MEDIA – RESPONSE EFFORTS

Humans suffer from the impacts of disasters, crises, and armed conflicts.

In the last two decades, disasters affected 218 million people each year, at an annual cost to the global economy that exceeds $300 billion. (Source: UN)

Page 3: Introduction to Machine Learning: An Application to Disaster Response

SANDY HURRICANE TWEETS

@NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC.

rt @911buff: public help needed: 2 boys 2 & 4 missing nearly 24 hours after they got separated from their mom when car submerged in si. #sandy #911buff

freaking out. home alone. will just watch tv #Sandy #NYC.

400 Volunteers are needed for areas that #Sandy destroyed.

Page 4: Introduction to Machine Learning: An Application to Disaster Response

SANDY HURRICANE TWEETS

[Informative] @NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC.

[Informative] rt @911buff: public help needed: 2 boys 2 & 4 missing nearly 24 hours after they got separated from their mom when car submerged in si. #sandy #911buff

[Personal] freaking out. home alone. will just watch tv #Sandy #NYC.

[Informative] 400 Volunteers are needed for areas that #Sandy destroyed.

Page 5: Introduction to Machine Learning: An Application to Disaster Response

SANDY HURRICANE TWEETS

[Informative: Caution and Advice] @NYGovCuomo orders closing of NYC bridges. Only Staten Island bridges unaffected at this time. Bridges must close by 7pm. #Sandy #NYC.

[Informative: Reports of missing people] rt @911buff: public help needed: 2 boys 2 & 4 missing nearly 24 hours after they got separated from their mom when car submerged in si. #sandy #911buff

[Personal] freaking out. home alone. will just watch tv #Sandy #NYC.

[Informative: Help/volunteers needed] 400 Volunteers are needed for areas that #Sandy destroyed.


Page 7: Introduction to Machine Learning: An Application to Disaster Response

FINDING TACTICAL AND ACTIONABLE INFORMATION

Top-level classes: Personal, Informative (Direct & Indirect), Other

Informative categories:
• Caution and advice: siren heard, warning issued/lifted, etc.
• Casualties and damage: people dead, injured, damage, etc.
• Donations: money, shelter, blood, goods, or services
• People missing, found, or seen
• Information source: webpages, photos, videos
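As a rough illustration, the two-level label scheme on this slide could be written down as a small data structure. The names `TOP_LEVEL`, `INFORMATIVE_SUBTYPES`, and `is_valid_label` are hypothetical and only mirror the slide; they are not taken from any AIDR code.

```python
# Hypothetical encoding of the slide's label scheme (names are illustrative).
TOP_LEVEL = ("Personal", "Informative", "Other")

# Fine-grained categories apply only to tweets labeled "Informative".
INFORMATIVE_SUBTYPES = {
    "Caution and advice": "Siren heard, warning issued/lifted, etc.",
    "Casualties and damage": "People dead, injured, damage, etc.",
    "Donations": "Money, shelter, blood, goods, or services",
    "People missing, found, or seen": "",
    "Information source": "Webpages, photos, videos",
}


def is_valid_label(top, subtype=None):
    """Return True if (top, subtype) is a legal combination in this scheme."""
    if top not in TOP_LEVEL:
        return False
    if subtype is None:
        return True
    # Only informative tweets carry a fine-grained category.
    return top == "Informative" and subtype in INFORMATIVE_SUBTYPES


print(is_valid_label("Informative", "Donations"))
print(is_valid_label("Personal", "Donations"))
```

Keeping the fine-grained categories under "Informative" only reflects the idea that a tweet is first judged personal or informative, and only then assigned an information type.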

Page 8: Introduction to Machine Learning: An Application to Disaster Response

USEFUL INFORMATION ON TWITTER

[Charts: % of informative tweets in each category and subcategory]

• Caution and advice: a siren heard; tornado warning issued/lifted; tornado sighting/touchdown
• Information source: photos as info. source; webpages as info. source; videos as info. source
• Donations: money; equipment, shelter, volunteers, blood; other donations
• Casualties & damage: people injured; people dead; damage

Ref: “Extracting Information Nuggets from Disaster-Related Messages in Social Media”. Imran et al. ISCRAM-2013, Baden-Baden, Germany.

Page 9: Introduction to Machine Learning: An Application to Disaster Response

INFORMATION PROCESSING PIPELINE (SUPERVISED LEARNING): OFFLINE APPROACH

1. Data collection
2. Human annotations on sample data
3. Machine training
4. Classification

Disaster timeline: DATA COLLECTION
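The four offline steps (collect, annotate a sample, train, classify) can be sketched in a few lines. This is a minimal illustration, assuming a bag-of-words Naive Bayes classifier written from scratch; the slides do not say which learning algorithm is used, and the `NaiveBayes` class and the two tweets marked as made-up are assumptions for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep words, hashtags, and @mentions."""
    return re.findall(r"[#@]?\w+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n_docs / total)  # log prior
            for tok in tokenize(text):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Steps 1 and 2: collected tweets with human annotations on a sample.
annotated = [
    ("@NYGovCuomo orders closing of NYC bridges. Bridges must close by 7pm. "
     "#Sandy #NYC.", "Informative"),
    ("400 Volunteers are needed for areas that #Sandy destroyed.", "Informative"),
    ("freaking out. home alone. will just watch tv #Sandy #NYC.", "Personal"),
    ("so bored at home, watching tv all day #Sandy", "Personal"),  # made-up
]

# Step 3: machine training on the annotated sample.
texts, labels = zip(*annotated)
model = NaiveBayes().fit(texts, labels)

# Step 4: classification of the remaining, unlabeled tweets (made-up examples).
unlabeled = [
    "Volunteers needed near the closed bridges #Sandy",
    "home alone watching tv",
]
predictions = [model.predict(t) for t in unlabeled]
print(predictions)
```

In the offline setting all four steps run once, in sequence, after the data has been collected, so nothing is classified until annotation and training are finished.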

Page 10: Introduction to Machine Learning: An Application to Disaster Response

IMPACT AND RESPONSE TIMELINE

Department of Community Safety, Queensland Govt. & UNOCHA, 2011

Disaster response (today) Disaster response (target)

Target disaster response requires real-time processing of data.

Page 11: Introduction to Machine Learning: An Application to Disaster Response

TIME-CRITICAL ANALYSIS OF BIG CRISIS DATA

Apply machine learning

Apply crowdsourcing

Page 12: Introduction to Machine Learning: An Application to Disaster Response

REQUIREMENTS & CHALLENGES

• Real-time analysis of data is required
  • For rapid crisis response
  • To reduce community harm
• Combine human and machine intelligence
• Usable and useful for end-users (mostly non-technical)
• End-users (stakeholders)
  • Crisis managers (policy makers)
  • Crisis responders (field workers)

Page 13: Introduction to Machine Learning: An Application to Disaster Response

REQUIREMENTS & CHALLENGES

Other key challenges:
• Volume: scale of data (20m tweets in 5 days)
• Velocity: analysis of streaming data (16k/min)
• Variety: different forms/types of data (information types)
• Veracity: uncertainty of data
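A quick back-of-the-envelope calculation shows what the volume and velocity figures imply for a processing system. The sketch below assumes that 16k/min is a peak rate and that a single worker handles tweets one at a time; the slide states neither, so treat both as assumptions.

```python
# Figures from the slide.
TWEETS = 20_000_000      # 20m tweets
DAYS = 5
PEAK_PER_MIN = 16_000    # 16k/min, assumed here to be a peak rate

# Average arrival rate over the whole period.
avg_per_min = TWEETS / (DAYS * 24 * 60)

# Peak arrival rate per second, and the time budget per tweet for a
# single sequential worker that must keep up with that peak (assumption).
peak_per_sec = PEAK_PER_MIN / 60
budget_ms = 1000 / peak_per_sec

print(f"average: {avg_per_min:.0f} tweets/min")
print(f"peak:    {peak_per_sec:.1f} tweets/s")
print(f"budget:  {budget_ms:.2f} ms per tweet")
```

Under these assumptions the average is roughly 2,800 tweets per minute, and keeping up with the peak leaves under 4 ms per tweet unless the work is spread across several workers.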

Page 14: Introduction to Machine Learning: An Application to Disaster Response

STREAM PROCESSING USING SUPERVISED ML

Combining human and machine computation

Quality assurance loops: human processing elements do the work, automatic processing elements check for consistency

Process-verify: work is done automatically, humans check low-confidence or borderline cases

Online supervised learning: humans train the machine to do the work automatically
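The process-verify pattern can be sketched as a simple routing rule: automatic labels above a confidence threshold are accepted, and the rest go to a human review queue. The threshold value, the function name `process_verify`, and the made-up third tweet are assumptions for illustration only.

```python
from collections import deque

CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off, not a value from the slides


def process_verify(predictions, threshold=CONFIDENCE_THRESHOLD):
    """Split machine predictions into accepted labels and a human review queue."""
    accepted = []
    review_queue = deque()
    for text, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((text, label))      # work done automatically
        else:
            review_queue.append((text, label))  # borderline: a human checks it
    return accepted, review_queue


# (text, predicted label, classifier confidence); confidences are invented.
predictions = [
    ("Bridges must close by 7pm. #Sandy #NYC.", "Informative", 0.95),
    ("will just watch tv #Sandy", "Personal", 0.91),
    ("sandy is at the door", "Informative", 0.55),  # made-up borderline case
]

accepted, review_queue = process_verify(predictions)
print(len(accepted), "accepted;", len(review_queue), "sent to human review")
```

Lowering the threshold reduces human workload at the price of accepting more uncertain labels; raising it does the opposite.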

Page 15: Introduction to Machine Learning: An Application to Disaster Response

INFORMATION PROCESSING PIPELINE: ONLINE APPROACH (REAL-TIME)

1. Data collection
2. Human annotations
3. Machine training
4. Classification

[Figure: online approach along the disaster timeline, with the labels DATA COLLECTION; Human annotation - 1, 2, 3, …, n; Learning-1, Learning-2, Learning-3, …, Learning-n; CLASSIFICATION OF DATA & DECISION MAKING PROCESS; First few hours]
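In the online approach, the model is updated each time a new round of human annotations arrives, and classification continues in between. The sketch below illustrates that loop with a simple perceptron, which is an assumption chosen for brevity; the slides do not name the learning algorithm, and the tweets and the `OnlinePerceptron` class are made up for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and keep words, hashtags, and @mentions."""
    return re.findall(r"[#@]?\w+", text.lower())


class OnlinePerceptron:
    """Binary perceptron over bag-of-words features, updated per example."""

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def score(self, text):
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokenize(text))

    def predict(self, text):
        return 1 if self.score(text) > 0 else 0

    def update(self, text, label):
        """Perceptron rule: change the weights only when the prediction is wrong."""
        error = label - self.predict(text)  # -1, 0, or +1
        if error:
            self.bias += error
            for t in tokenize(text):
                self.weights[t] = self.weights.get(t, 0.0) + error


# Rounds of human annotation arriving over time (1 = informative, 0 = personal).
annotation_rounds = [
    [("bridges closed by 7pm #sandy", 1), ("home alone watching tv #sandy", 0)],
    [("volunteers needed in staten island #sandy", 1),
     ("so scared right now #sandy", 0)],
]

clf = OnlinePerceptron()
for round_no, batch in enumerate(annotation_rounds, start=1):
    for text, label in batch:       # Learning-1, Learning-2, ...
        clf.update(text, label)
    # Classification continues with the latest model after every round.
    print(round_no, clf.predict("volunteers needed to close bridges"))
```

Unlike the offline pipeline, no step waits for the whole dataset: each annotation round improves the model that is already classifying the stream.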

Page 16: Introduction to Machine Learning: An Application to Disaster Response

http://aidr.qcri.org/

AIDR (Artificial Intelligence for Disaster Response) is a free, open-source, and easy-to-use platform to automatically filter and classify relevant tweets posted during humanitarian crises.

1. Collect
2. Curate
3. Classify

Page 17: Introduction to Machine Learning: An Application to Disaster Response

AIDR: FROM THE END-USER'S PERSPECTIVE

2-step approach:

1. Collection: a collection is a set of filters
• Keywords, hashtags
• Geographical bounding box
• Languages
• Follow specific set of users

2. Classifier(s): a classifier is a set of tags
• Donation requests & offers
• Damage & casualties
• Eyewitness accounts
• …
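To make "a collection is a set of filters" concrete, here is a minimal sketch of such a filter set. The `Collection` class, the tweet field names (`text`, `lang`, `user`, `coordinates`), the rule that a tweet must pass every filter that is set, and the bounding-box coordinates are all assumptions for illustration; they do not describe AIDR's actual data model or matching logic.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Collection:
    """Hypothetical set of filters; an unset filter matches everything."""
    keywords: Set[str] = field(default_factory=set)
    # (min_lon, min_lat, max_lon, max_lat)
    bounding_box: Optional[Tuple[float, float, float, float]] = None
    languages: Set[str] = field(default_factory=set)
    follow_users: Set[str] = field(default_factory=set)

    def matches(self, tweet: dict) -> bool:
        text = tweet.get("text", "").lower()
        if self.keywords and not any(k.lower() in text for k in self.keywords):
            return False
        if self.languages and tweet.get("lang") not in self.languages:
            return False
        if self.follow_users and tweet.get("user") not in self.follow_users:
            return False
        if self.bounding_box is not None:
            coords = tweet.get("coordinates")  # (lon, lat) or None
            if coords is None:
                return False
            lon, lat = coords
            min_lon, min_lat, max_lon, max_lat = self.bounding_box
            if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
                return False
        return True


# Rough, made-up box around New York City for the example.
sandy = Collection(keywords={"#sandy"}, languages={"en"},
                   bounding_box=(-74.3, 40.4, -73.6, 41.0))

tweet = {"text": "Bridges must close by 7pm. #Sandy #NYC.", "lang": "en",
         "user": "NYGovCuomo", "coordinates": (-74.0, 40.7)}
print(sandy.matches(tweet))
print(sandy.matches({**tweet, "lang": "es"}))
```

Whether filters should be combined with AND, as here, or with OR is a design choice; OR collects more data at the cost of more irrelevant tweets for the classifiers to discard.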

Page 18: Introduction to Machine Learning: An Application to Disaster Response

AIDR APPROACH

[Figure: AIDR data flow, with the labels Collection, Classifier(s), Learner, Classifier-1, Classifier-2, Tag, and a rate of 30k/min]

Page 19: Introduction to Machine Learning: An Application to Disaster Response

AIDR: HIGH-LEVEL ARCHITECTURE

Page 20: Introduction to Machine Learning: An Application to Disaster Response

QUALITY VS. COST

• Gaining acceptable quality
• Quality (classification accuracy)
• Cost (human labels: monetary in case of paid workers, time in case of volunteers)

[Figures: quality vs. cost using passive learning; quality vs. cost using active learning]
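The difference between passive and active learning lies in which items are sent to human labelers: passive learning picks them at random, while active learning picks the ones expected to help the classifier most. The sketch below uses uncertainty sampling, one common active-learning strategy; the slides do not say which strategy AIDR uses, and the item IDs and probabilities are made up.

```python
import random


def uncertainty(prob_positive):
    """1.0 when the classifier is undecided (p = 0.5), 0.0 when it is certain."""
    return 1.0 - abs(prob_positive - 0.5) * 2


def select_active(scored_items, budget):
    """Active learning: label the items the classifier is least sure about."""
    ranked = sorted(scored_items, key=lambda pair: uncertainty(pair[1]),
                    reverse=True)
    return [item for item, _ in ranked[:budget]]


def select_passive(scored_items, budget, seed=0):
    """Passive learning: label a random sample, ignoring classifier output."""
    rng = random.Random(seed)
    return [item for item, _ in rng.sample(scored_items, budget)]


# (item id, classifier's estimated probability of "informative"); made-up.
pool = [("t1", 0.97), ("t2", 0.52), ("t3", 0.10), ("t4", 0.45)]

chosen = select_active(pool, budget=2)
print(chosen)
```

With the same labeling budget, the active selection spends human effort on borderline tweets instead of ones the classifier already handles confidently, which is the intuition behind comparing the two quality-vs-cost curves.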

Page 21: Introduction to Machine Learning: An Application to Disaster Response

PERFORMANCE


• In terms of throughput and latency

Throughput of feature extractor, classifier, and the system

Latency of feature extractor, classifier, and the system
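Throughput and latency of a processing component can be measured with a small timing harness like the one below. This is a generic sketch, not AIDR's benchmarking code; the `measure` function and the stand-in feature extractor are assumptions.

```python
import time


def measure(process, items):
    """Return (throughput in items/second, mean latency in seconds per item)."""
    latencies = []
    start = time.perf_counter()
    for item in items:
        t0 = time.perf_counter()
        process(item)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed, sum(latencies) / len(latencies)


def extract_features(text):
    """Stand-in feature extractor: lowercase bag of words."""
    return set(text.lower().split())


tweets = ["400 Volunteers are needed for areas that #Sandy destroyed."] * 10_000
throughput, latency = measure(extract_features, tweets)
print(f"throughput: {throughput:.0f} items/s, "
      f"mean latency: {latency * 1e6:.1f} microseconds")
```

Measuring the feature extractor, the classifier, and the whole system separately, as the slide does, shows which stage limits the end-to-end rate.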

Page 22: Introduction to Machine Learning: An Application to Disaster Response

CHALLENGES: DOMAIN ADAPTATION

• Crisis-specific labels are necessary
  • Contrasting vocabulary use
  • Differences in public concerns, affected infrastructure
  • New labels should be collected for each new crisis

[ Imran et al. 2013b ]

• Domain adaptation
  • Train models using all past labeled data (all types of events)
  • Train on labeled data from past similar events
  • Train on data from neighboring countries on similar events
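The three domain-adaptation options differ only in which past labeled data is selected for training. A minimal sketch of that selection step follows; the record fields, strategy names, event types, and country codes are hypothetical placeholders, and the data is made up.

```python
# Made-up pool of labeled tweets from past events (placeholders throughout).
PAST_LABELED = [
    {"event_type": "typhoon", "country": "A", "text": "roads flooded", "label": "Damage"},
    {"event_type": "typhoon", "country": "B", "text": "send blankets", "label": "Donations"},
    {"event_type": "earthquake", "country": "A", "text": "building collapsed", "label": "Damage"},
    {"event_type": "typhoon", "country": "C", "text": "stay indoors", "label": "Caution"},
]


def select_training_data(past, strategy, event_type=None, countries=()):
    """Pick past labeled examples according to one of the slide's options."""
    if strategy == "all_past":          # all past labeled data, all event types
        return list(past)
    if strategy == "similar_events":    # past events of the same type
        return [ex for ex in past if ex["event_type"] == event_type]
    if strategy == "neighbors":         # similar events in neighboring countries
        return [ex for ex in past
                if ex["event_type"] == event_type and ex["country"] in countries]
    raise ValueError(f"unknown strategy: {strategy}")


# A new typhoon hits a country whose neighbors are A and B (hypothetical).
for strategy in ("all_past", "similar_events", "neighbors"):
    subset = select_training_data(PAST_LABELED, strategy,
                                  event_type="typhoon", countries={"A", "B"})
    print(strategy, len(subset))
```

Narrower selections give less training data but data that is closer in vocabulary and concerns to the new crisis, which is the trade-off the slide points at.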

Page 23: Introduction to Machine Learning: An Application to Disaster Response

AIDR – COLLECTION SETUP

[Screenshots: collection detail dashboard; collection definition, with geographical region filter and language filter]

Page 24: Introduction to Machine Learning: An Application to Disaster Response


AIDR – CLASSIFIER SETUP

Page 25: Introduction to Machine Learning: An Application to Disaster Response

AIDR – CLASSIFIER SETUP (cont.)


Page 26: Introduction to Machine Learning: An Application to Disaster Response

AIDR – CROWDSOURCING-1

Internal Tagging Interface

Page 27: Introduction to Machine Learning: An Application to Disaster Response

AIDR – CROWDSOURCING-2

MicroMapper Interface (browser clicker)

Mobile clicker

Page 28: Introduction to Machine Learning: An Application to Disaster Response

AIDR – OUTPUT

Training examples

Classified output (achieved accuracy ~75%)

Page 29: Introduction to Machine Learning: An Application to Disaster Response

TYPHOON HAGUPIT (2014)

- Killed 27 people
- A million evacuated
- $114 million in damage

Page 30: Introduction to Machine Learning: An Application to Disaster Response

DEMO

http://aidr.qcri.org/

Page 31: Introduction to Machine Learning: An Application to Disaster Response

AIDR has been awarded the Grand Prize in the Open Source Software World Challenge 2015

Page 32: Introduction to Machine Learning: An Application to Disaster Response

http://aidr.qcri.org/

AIDR (Artificial Intelligence for Disaster Response) is a free, open-source, and easy-to-use platform to automatically filter and classify relevant tweets posted during humanitarian crises.

Thank you!