
Humanitarian Assistance Ontology Implementation during Disaster Management in Chennai Flood-2015 Using Text Mining Techniques

C. Anbarasi (1) and P. Mayilvahanan (2)

(1) Vels University, Pallavaram, Chennai. [email protected]

(2) Department of Computer Applications, Vels University, Chennai, Tamil Nadu, India. [email protected]

Abstract

A disaster management plan for the city is in the works, following alarm over a series of recent earthquakes in Nepal, the tremors felt in various parts of the country, and the 2015 flood disaster in Chennai. The main task of this research work is to construct an application that applies data mining techniques and algorithms during a disaster situation.

The Commissionerate of Revenue Administration, in association with Chennai Corporation and the Chennai district collectorate, will work on this, according to sources. Chennai does not yet have a comprehensive disaster management plan with predefined roles, responsibilities and specific tasks for each official. The plan will include detailed mapping of the safest escape routes and of resources for facilitating rescue and relief operations. The lack of such a plan has previously delayed relief and rescue work after major disasters in the city, such as the tsunami and floods.

We will start collecting data on resources at the ward level. Chennai Corporation has 200 wards covering an area of 426 sq km. Ward level mapping was done after the tsunami. Some of the earlier mapping work for disaster preparedness is no longer relevant, because the boundaries of wards and zones changed following the expansion of the city, said a disaster management expert who worked on the resilience index for the city. A Climate Disaster Resilience Index prepared for Chennai, based on data collected on five elements (physical, social, economic, institutional and natural) for the ten old zones of the Corporation, has to be revised for the 15 existing zones.

International Journal of Pure and Applied Mathematics, Volume 116 No. 21 2017, 729-739. ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version). url: http://www.ijpam.eu. Special Issue.

There is also a need for integrating other local bodies on the outskirts in

the disaster management plan. According to a previous study, the coastline

from Ennore to Kasimedu Fishing Harbour was found to be safer. The

coastline from Cooum River to Kovalam creek was more vulnerable to

disasters such as a tsunami. The plan for disaster management will have a

list of low lying areas, slums, persons with disability, senior citizens,

pregnant ladies, cooks, electricians, power cutting tools, ham radios,

dilapidated buildings, hospitals and schools.

Humanitarian Assistance Ontology in data mining for crisis response is a new direction for researchers; its purpose is to develop an application tool that uses the ontology concept in data mining. In existing methods, the data is collected manually. After collection, it is sent to another team for inspection and verification. Once inspection and verification are complete, the report is submitted to the supervising team and is then sent to the decision makers, who assign human assistance during the disaster. Carrying out this work during a disaster is therefore a tedious procedure that depends on a large number of workers.

The recently proposed OntoDM helps to build a decision support system for decision makers during a disaster situation, using the humanitarian assistance ontology. OntoDM is based on a general framework for data mining, which includes definitions of basic data mining concepts such as data type and dataset. Online disaster data collected from social media is extremely heterogeneous, both structurally and semantically. This creates a need for data integration and ingestion, so that emergency management officials can be assisted in rapid disaster recovery whenever disasters occur.

Comments and queries posted during a disaster through channels such as Twitter, WhatsApp, e-mail, Facebook and blogs are collected and integrated for analysis and management. In this research work, sample data from the Chennai flood of 2015 is taken for analysis. The research is conducted with social conscience, so that it can help society to be aware of the disaster situation. When the situation is shared through social media, many people are made aware of the current crisis needs. The work applies the concept of ontology and shows how it can help affected people to share their situation online and seek the help of the National Disaster Response Force (NDRF) and the State Disaster Response Force (SDRF). As an attempt to bring out new trends in data mining research, this comprehensive study builds an onto-database that delivers the right information at the right time to the right people, as a technology in the context of crisis response.

Key Words: Disaster management, social media, ontology, classification algorithm, text mining, word cloud.

1. Introduction

All these data are in many different formats. The solutions can be categorized into main components that interact with one another so as to employ disaster management data spaces. Immediately after a disaster, the data is not in a structured format. By applying data mining concepts and algorithms, however, the data is automatically pre-processed into useful information for decision makers in a short time. To pre-process the data, this research work proposes a method that merges logic rules and data mining techniques to represent the humanitarian needs.

2. Literature Review

The findings have shown that very little of the necessary data collected during a disaster has been analysed. Crisis management operations must be dynamic rather than static, yet there is no proper method for them. It is also necessary to develop a decision support system for decision makers during crisis response, to enhance task performance [1]. The work in [2] proposes a graph clustering method on a directed weighted graph network for detecting communities in a social network, based on neighbourhood nodes and the frequency of the path traversed. Social media attracts young people to organize difficult tasks and motivates them to act as activists during emergencies. However, research is still being carried out to find the best way to organize and improve response methods during a crisis [3]. Social media supports wide-scale interaction that can be collectively resourceful, self-policing, and generative of information that is otherwise hard to obtain [4].

An automatic approach for identifying messages communicated via Twitter contributes to situational awareness, which helps to alert the public during emergencies [5]. The Intelligent Disaster Decision Support System (IDDSS) was recently developed to provide a platform for integrating a vast range of road network, traffic, geographic, economic and meteorological data, as well as dynamic disaster and transport models [6]. The work in [7] assesses the role of intelligent decision support systems (IDSS) in decision making: it first explores the definitions and understanding of DSS and IDSS, and then illustrates a framework for IDSS along with the various tools and technologies that support it.

3. Proposed Methodology

Once the cluster solutions are obtained, the clusters are further examined for correct classification. In the classification algorithm the data sets are homogeneous. This research work deals with water-related comments from 300 users. The proposed method also recommends appropriate humanitarian responses. The main advantage of OntoDM is that it prioritizes humanitarian needs and identifies humanitarian responses to water-related queries automatically. The proposed method converts unstructured data into structured data, so that decision makers can focus more on implementing the solutions. The method is implemented on real data from the Chennai Flood 2015 crisis, and its use there clearly shows the need for humanitarian assistance in specific places, to avoid delay and confusion in humanitarian responses in real time. The developed application, DisManOnto, collects information related to the disaster situation from social media. Using a classification algorithm, water-related queries are extracted and stored in a database. By applying a set of ontological rules, the humanitarian assistance ontology is recommended for constructing a decision support system.
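The rule step described above can be pictured with a small sketch. The Python example below is illustrative only: the keyword rules, need categories and response labels (RULES, recommend_response) are hypothetical and are not taken from the DisManOnto application or from OntoDM, and the sample comment is made up.

```python
import re

# Hypothetical ontological rules: each rule links trigger keywords to a
# humanitarian need and a suggested response. All values are assumptions.
RULES = [
    {"keywords": {"drinking", "thirsty", "bottle"},
     "need": "drinking water",
     "response": "dispatch packaged drinking water"},
    {"keywords": {"flooded", "stagnant", "waterlogged"},
     "need": "water removal",
     "response": "send pumping crew"},
    {"keywords": {"stranded", "trapped", "rescue"},
     "need": "evacuation",
     "response": "alert rescue boats"},
]


def recommend_response(comment):
    """Return (need, response) pairs for every rule matched by the comment."""
    # Lower-case the comment and keep alphabetic tokens only, so that
    # punctuation attached to a word does not block a match.
    words = set(re.findall(r"[a-z]+", comment.lower()))
    matches = []
    for rule in RULES:
        if words & rule["keywords"]:
            matches.append((rule["need"], rule["response"]))
    return matches


if __name__ == "__main__":
    for need, response in recommend_response(
            "Family stranded, street flooded in Velachery"):
        print(need, "->", response)
```

Rules are checked in the order they are listed, so a comment that matches several rules yields several recommendations; a real system would also need to rank them by priority.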

Structure of HAO-System

Figure 1 presents the overall data flow for the implementation of HAO in identifying the response type during a disaster. Data is collected from social media using text mining libraries. The unstructured data is converted into structured input data, and features are extracted from it. These features are presented to the input layer of the HAO algorithm along with the target values assigned for each cluster analysis. At the end of the cluster analysis, the final SPSS results are stored in the database.

Figure 1: Structure of HAO-System

Architecture of HAO with Text Mining Libraries

Figure 2: Architecture of HAO-System

Social Media - Data Query

When a disaster happens, the system will receive a lot of information at once. It is necessary for the system to select the small portion of entities that a user really cares about for display in the dashboards.

[Diagram labels recovered from the figures above: disaster management data collection from messages, e-mail, Twitter, WhatsApp, Facebook, blogs and web portal; inductive queries; data preprocessing; data extraction; unstructured data; structured data; onto database; document summarization; spatial clustering; classification clustering; reports.]

Development Sample

By applying the following statistical methods, the sample data can be developed into useful information.

1. Statistical modeling – natural language processing, sentiment analysis, naïve Bayes, k-means clustering.

2. Business rules formulation – strategic rules to promote meaningful insights.

3. Statistical tools – R language and SPSS.
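As a rough illustration of the naïve Bayes step listed above, the following Python sketch trains a multinomial naïve Bayes classifier with add-one smoothing to separate water-related comments from other comments. It is a stand-in written with the standard library only; the paper's own analysis uses R and SPSS, and the class name, labels and training comments here are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of comments
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence and are skipped.
        words = [w for w in tokenize(text) if w in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training comments; the labels "water" and "other" are assumptions.
TRAIN_COMMENTS = [
    ("no drinking water in our street", "water"),
    ("water entered the house need help", "water"),
    ("power cut since morning", "other"),
    ("need food packets for children", "other"),
]

if __name__ == "__main__":
    clf = NaiveBayesClassifier()
    clf.fit(*zip(*TRAIN_COMMENTS))
    print(clf.predict("we need drinking water"))
```

Four training comments are far too few for a usable model; the point is only to show how word counts per class turn into a class decision.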

Text Mining

The data processing and service component retrieves data from the databases and performs the necessary data processing, analytical, or mining functions to generate the response results requested from the web interface, e.g., customer remarks and information from external sources. The unformatted input data from the developed sample is changed to formatted data using text mining libraries and a word cloud.

Decision Making by NLP, Sentiment Analysis

We have developed techniques to facilitate information sharing and collaboration between private and public sector participants for major disaster recovery planning and management. The proposed work is an integrated environment for exploratory analysis of spatial data that equips an analyst with a variety of data mining tools and provides automated mapping of source data and data mining results.

4. Text Mining Techniques Used

Creating a word cloud using R software

Install and Load the Required Packages

Text mining and word cloud packages are required. They can be installed and loaded using R code, for example install.packages("tm") for text mining and install.packages("SnowballC") for text stemming, followed by library("tm") and library("SnowballC").

Algorithm 1 - Term Extraction

Input: A corpus M of comments.

Output: The list N of candidate two-word terms.

Step 1: Collect bigram frequencies for M in a proximity database ontoDB.

Step 2: For all 4-grams s t u v in M, remove one count for t u in ontoDB if

- mi(t, u) < mi(s, t), or

- mi(t, u) < mi(u, v)

Step 3: For all entries (t, u) in ontoDB, add (t, u) to the list N if

- C(t, u) > minCount, and

- S(t, u) > minLogM

Here mi denotes the mutual information of a bigram, C its count in ontoDB, and S its score, which is compared against the threshold minLogM.
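The three steps of Algorithm 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the paper does not define the score S, so pointwise mutual information is used for both mi and S here; the function name, parameter names and toy sentences are hypothetical; and a plain Counter stands in for the proximity database ontoDB.

```python
import math
from collections import Counter


def extract_two_word_terms(sentences, min_count=1, min_score=0.0):
    """Return candidate two-word terms from tokenized sentences."""
    # Step 1: collect unigram and bigram frequencies.
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    total = sum(unigrams.values())

    def mi(a, b):
        # Pointwise mutual information from the original (unadjusted) counts.
        joint = bigrams[(a, b)]
        if joint == 0:
            return -math.inf
        return math.log(joint * total / (unigrams[a] * unigrams[b]))

    # Step 2: for every 4-gram s t u v, discount the middle bigram (t, u)
    # when a neighbouring bigram is more strongly associated.
    adjusted = Counter(bigrams)
    for tokens in sentences:
        for s, t, u, v in zip(tokens, tokens[1:], tokens[2:], tokens[3:]):
            if mi(t, u) < mi(s, t) or mi(t, u) < mi(u, v):
                adjusted[(t, u)] -= 1

    # Step 3: keep bigrams that pass both thresholds.
    # Assumption: the score S is taken to be the same PMI value as mi.
    return sorted(
        pair for pair, count in adjusted.items()
        if count > min_count and mi(*pair) > min_score
    )


if __name__ == "__main__":
    comments = [
        "we need drinking water now".split(),
        "we need drinking water now".split(),
        "we need food now".split(),
        "we need boats now".split(),
    ]
    print(extract_two_word_terms(comments, min_count=1, min_score=2.0))
```

In the toy data the bigram "need drinking" is discounted in Step 2 because "drinking water" is the stronger association, which is the behaviour the 4-gram rule is meant to produce.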

Algorithm 2 - Multi-word term extraction algorithm

Input: A list N of two-word candidate terms for a corpus M in any language, and a proximity database ontoDB consisting of bigram frequencies for M.

Output: The list X of extracted multi-word terms.

Step 1: Collect features for the candidate terms.

For each candidate u in N

For each occurrence v1 v2 … u … v2i-1 v2i in M

Add all possible substrings involving u to ontoDB.

Step 2: Prune the proximity database.

Remove each entry in ontoDB that has frequency < minFreq.

Step 3: Extend the two-word candidates into an initially empty list X.

For each candidate u in N

extend(u, X, ontoDB)

If most occurrences of u in the corpus have not been extended, then add u to X.

Cleaning the text

The tm_map() function is used to remove unnecessary white space, to convert the text to lower case, and to remove common stop words such as "the" and "we". The information value of stop words is near zero because they are so common in a language, so removing them is useful before further analysis. The text can be cleaned in R with the following steps:

Algorithm 3 - To remove stop words

Input – group of sentences

Output – sentences without stop words

Step 1. Read the input text.

Step 2. Convert the text to lower case.

Step 3. Remove numbers.

Step 4. Remove common English stop words.

Step 5. Remove your own stop words, specified as a character vector.

Step 6. Remove punctuation.

Step 7. Eliminate extra white spaces.

Step 8. Apply text stemming.

Step 9. Return the sentences without stop words.
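The cleaning steps above can be mirrored outside R as well. The following Python sketch is a stand-in for the tm_map() pipeline, not a reproduction of it: the stop word set is a tiny illustrative subset, naive_stem is a deliberately crude suffix stripper rather than the Snowball stemmer, and all names are hypothetical. Punctuation is removed before the stop word filter here so that a stop word followed by a comma is still caught.

```python
import re

# Small illustrative subset of English stop words (an assumption, not
# the list shipped with the R tm package).
ENGLISH_STOPWORDS = {
    "a", "an", "and", "are", "for", "in", "is", "of", "the", "to", "we",
}


def naive_stem(word):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def clean_text(text, extra_stopwords=()):
    """Return the text lower-cased, filtered and stemmed."""
    text = text.lower()                   # convert to lower case
    text = re.sub(r"\d+", " ", text)      # remove numbers
    text = re.sub(r"[^\w\s]", " ", text)  # remove punctuation
    stopwords = ENGLISH_STOPWORDS | set(extra_stopwords)
    # split() also collapses any extra white space.
    tokens = [t for t in text.split() if t not in stopwords]
    return " ".join(naive_stem(t) for t in tokens)


if __name__ == "__main__":
    print(clean_text("We are WAITING for 3 boats in the flooded streets!!",
                     extra_stopwords=["boats"]))
```

The extra_stopwords argument plays the role of the user-specified character vector in the R workflow.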

Algorithm 4 - Data generation in tabular form

Input – Unstructured information

Output – Structured information

Step 1 – Browse the information.

Step 2 – Document preprocessing: sentence boundary detection, stop word removal, stemming, term disambiguation.

Step 3 – Term extraction: calculate term frequencies and calculate percentages.

Step 4 – Check for the terms in the ontology.

Step 5 – Store the data in the table for the terms that have similarity frequency.
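Steps 3 to 5 of Algorithm 4 amount to counting terms, converting counts to percentages, and keeping the terms that the ontology knows about. A minimal Python sketch follows, assuming the comments have already been preprocessed; the ontology is reduced to a hypothetical dictionary (ONTOLOGY_TERMS) that maps a term to a concept, the percentage is taken over all terms in the input, and the output rows are plain dictionaries rather than database records.

```python
from collections import Counter

# Hypothetical ontology lookup: term -> humanitarian concept.
ONTOLOGY_TERMS = {"water": "Water", "food": "Food", "boat": "Rescue"}


def build_term_table(cleaned_comments, ontology=ONTOLOGY_TERMS):
    """Return one row per ontology term found in the cleaned comments."""
    # Step 3: term frequencies over all comments.
    counts = Counter()
    for comment in cleaned_comments:
        counts.update(comment.split())
    total = sum(counts.values())

    rows = []
    # Sort by descending frequency, then alphabetically, for stable output.
    for term, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        # Step 4: keep only terms that appear in the ontology.
        if term in ontology:
            # Step 5: store the structured record.
            rows.append({
                "term": term,
                "concept": ontology[term],
                "frequency": freq,
                "percentage": round(100 * freq / total, 1),
            })
    return rows


if __name__ == "__main__":
    for row in build_term_table(["water water food", "boat water help"]):
        print(row)
```

A table built this way can be handed directly to a word cloud or reporting step, since each row already carries the term and its weight.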

5. Implementation

The word cloud and text mining techniques are applied to the Chennai flood 2015 data set and the results are generated. The screenshots below show that the text mining libraries are implemented successfully. Comments from social media are captured, clustered, classified and evaluated, and reports related to crisis response are generated.

Figure 3: Disaster Management System Home Page

Figure 4: Database - Chennai Flood 2015

Figure 5: Display of Calculation of Percentage from Various Respondents

6. Future Enhancement

The system evaluation results demonstrate the effectiveness and efficiency of the proposed approaches. During system implementation and assessment, the users provided suggestions and pointed out limitations and possible enhancements. Future efforts will focus on the following tasks: evaluation with image and video files, and development of a faster and more accurate application tool that captures the current user's comments and provides actionable answers dynamically. The feedback from users is positive and suggests that the system can be used to share valuable actionable information and to pursue more complex tasks.

7. Conclusion

This research work discusses the steps of HAO implementation during disaster management in the Chennai flood of 2015. The extraction of structured data from unstructured data conveys the demographic details of the affected people. Based on the proposed methodology, an application tool is developed from inductive queries and comments collected from social media during the Chennai flood of December 2015. The data set is classified using text mining libraries and a word cloud, and the classification algorithm, applied together with the logical rules, helps to structure the data. From the pool of information, raw data is converted into meaningful information, which helps decision makers to act quickly in crisis response during a disaster situation.

References

[1] Saeed N.A.L., Zakaria N.H., Ahmad M.N., The use of Social Media in Knowledge Integration for Improving Disaster Emergency Management Task Performance: Review of Flood disasters, Indian Journal of Science and Technology 9(34) (2016), 1-12.

[2] Parimala M., Lopez D., Kaspar S., K-Neighbourhood Structural Similarity Approach for Spatial Clustering, Indian Journal of Science and Technology 8(23) (2015), 1-11.

[3] Singla M.L., Apoorv D., How Social Media Gives You Competitive Advantage, Indian 8(4) (2015), 1-6.

[4] Sutton J., Palen L., Shklovski I., Backchannels on the front lines: Emergent uses of social media in the southern California wildfires, Proceedings of the 5th International ISCRAM Conference (2008), 1-9.

[5] Verma S., Natural Language Processing to the Rescue Extracting Situational Awareness, Tweets During Mass Emergency, ICWSM (2011),1-8.

[6] Kaviani A., Thompson R.G., Rajabifard A., Griffin G., Chen Y., A decision support system for improving the management of traffic networks during disasters, Australasian Transport Research Forum (2015).

[7] Tariq A., Rafi K., Intelligent decision support systems-A framework, Information and Knowledge Management 2(6) (2012), 1-9.

[8] Lindsay B., Social Media and Disasters: Current Uses, Future Options, and Policy Considerations, Congressional Research Service (2011).

[9] Bizer C., Lehmann J., Kobilarov G., Auer S., Becker C., Cyganiak R., Hellmann S., DBpedia-A crystallization point for the Web of Data. Web Semantics: science, services and agents on the world wide web 7(3) (2009), 154-165.

[10] Nuno F., Borbinha J., Calado P., An Approach for Named Entity Recognition in Poorly Structured Data, The Semantic Web: Research and Applications (2012), 718–732.

[11] Frantzi K., Ananiadou S., Mima H., Automatic recognition of multi-word terms: the c-value/nc-value method, International Journal on Digital Libraries 3(2) (2000), 115-130.

[12] Harrison C., Jorder M., Stern H., Stavinsky F., Reddy V., Hanson H., Waechter H., Lowe L., Gravano L., Balter S., Using online reviews by restaurant patrons to identify unreported cases of foodborne illness–New York City, 2012-2013, Morbidity and Mortality Weekly Report (MMWR) 63(20) (2014), 441–445.

[13] Kang J.S., Kuznetsova P., Luca M., Choi Y., Where not to eat? Improving public policy by predicting hygiene inspections using online reviews, Empirical Methods in Natural Language Processing (2013), 1443–1448.

[14] Lamb A., Paul M.J., Dredze M., Separating fact from fear: Tracking flu infections on Twitter, Proceedings of NAACL-HLT (2013), 789–795.

[15] Myers S.A., Leskovec J., The bursty dynamics of the Twitter information network, Proceedings of the 23rd International World Wide Web Conference (2014), 913–924.

[16] Naaman M., Becker H., Gravano L., Hip and trendy: Characterizing emerging trends on Twitter, Journal of the American Society for Information Science and Technology 62(5) (2011), 902–918.

[17] Paul M.J., Dredze M., You are what you tweet: Analyzing Twitter for public health, Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media (2011).

[18] Petrović S., Osborne M., Lavrenko V., Streaming first story detection with application to Twitter, Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (2010), 181–189.

[19] Psallidas F., Becker H., Naaman M., Gravano L., Effective event identification in social media, IEEE Data Engineering Bulletin 36(3) (2013), 42–50.

[20] Magdum P.M., Nandedkar V.S., Developing D-Matrix of Unstructured Text Using Ontology Based Text Mining. International Advanced Research Journal in Science, Engineering and Technology 3(6) (2016), 122-144.
