Suicide Bombing Forecaster – Novel Techniques to Predict Patterns of Suicide Bombing in Pakistan

Zeeshan ul Hassan Usmani, Interactive Group

Sarah Irum, Interactive Group

Saad Qadeer, LUMS

Taimur Qureshi, Interactive Group

Keywords: Reality Mining, Terrorism Forecast, Pattern Matching, Data Mining, Big Data

Abstract

Terrorist activities (suicide bombings, IEDs, etc.) have plagued countries such as Pakistan, Iraq, and Afghanistan for a number of years. The majority of these human and smart bombs can strike at any time and place, giving law enforcement agencies little or no chance to respond or deploy proactive measures. Law enforcement agencies employ various defensive measures to prevent these incidents, such as deployment of forces, check points, surveillance, and proactive intelligence, yet the bombings continue to increase. An effective proactive measure is to predict the occurrence of such events in advance, so that law enforcement agencies have prior warning and can deploy preemptive measures around the danger zones at specific times of the year. This paper presents SB Forecaster, an advance warning and mitigation system that uses predictive and pattern analysis to aid the agencies.

1. INTRODUCTION

A suicide attack can be defined as a politically motivated and violent-intended action, with prior intent, by one or more individuals who choose to take their own life while causing maximum damage to the chosen target. Suicide bombing has become one of the most lethal, unpredictable, and favored modi operandi of terrorist organizations. Though only 3% of all terrorist attacks around the world can be classified as suicide bombing attacks, these account for 48% of the casualties [1].

The challenging mission is to prevent terrorism. Several factors make prevention difficult: the suicide bomber dies in the attack, leaving no traces; the equipment used is cheap and easy to acquire; and the organizations that recruit suicide bombers draw on local people. Even the characteristics of suicide bombers change, from men to women and, in some cases, children, which makes identifying a suicide bomber difficult.

Despite all the hurdles faced in putting an end to terrorism, different techniques can be used to exploit patterns in the behavior of terrorist organizations. These patterns can be identified using historical data and other statistical measures.

The focus of this research is the use of data mining algorithms to unveil suicide bombing patterns in Pakistan. In this paper, we present prediction techniques that identify high-risk areas, the next terrorist attack, and terrorist organizations through injury patterns.

2. BACKGROUND

This section shows first the upcoming threats of terrorist organizations. Secondly, it shows data mining techniques for analyzing data. Thirdly, it presents the basic information about the database used in our analysis. Finally, it provides techniques for analysis such as GIS Maps.

2.1. Terrorist Organizations

Various terrorist organizations in Pakistan take part in suicide bombings. The first is Tehreek-e-Taliban Pakistan (TTP), which comprises 40 militant commanders with a collective strength of about 25,000 and is considered the most lethal of the Taliban outfits in Pakistan’s wily regions bordering Afghanistan. Other terrorist organizations include Lashkar-e-Jhangvi, Abdullah Uzam Brigade, Masud Group, Karwan e Nematullah, Militant Commander Molvi Nazir, Al Qaeda Taliban Linked, and Tehrek e Taliban Punjab wing. The percentages of suicide attacks carried out by these groups from 1995 to 2011 are shown in Fig. 1.

Figure 1. Percentage of Suicide Attacks by Groups (1995-2011)

2.2. Data Source

The complete dataset used in this research is available at a public portal, PBC (www.PakistanBodyCount.org). Historical terrorist events are researched and then compiled on PBC. Data is collected from media reports, hospitals, and the internet. All the gathered data, along with analysis of suicide bombings and drone attacks from 1995 to date, is publicly available on PBC. Other sources such as PIPS [3] and SATP [4] also contain datasets of suicide attacks in Pakistan. PBC reports the total number of suicide bombing victims in terrorist attacks in Pakistan as 5,229 casualties and 13,661 injuries. The total suicide bombing incidents reported by PBC are shown in Fig. 2. This data is used for the analysis of suicide bombing attacks, which was conducted using tools such as C# and ArcGIS.

Figure 2. Suicide Bombings in Pakistan (1995-2012)

2.3. GIS Maps

A Geographical Information System (GIS) helps to analyze, interpret, understand, and visualize data. Suicide bombing patterns across Pakistan are predicted and displayed using GIS maps. These maps visualize high-risk cities and the risk factor of every city in Pakistan. The data gathered in the data collection phase is analyzed, and the result is then shown on a two-dimensional geographical view. This presents the results of sophisticated analysis more clearly.

3. STATE-OF-THE-ART

Law enforcement agencies need to stay a step ahead of terrorists and thus need to continuously predict their activities and target locations. While defensive measures can provide a first layer of protection against terrorist attacks, agencies cannot rely entirely on those means and need to be more effective. Employing proactive measures that use new predictive technologies to anticipate the actions of terrorists provides another effective line of defense. In this paper, we describe new predictive techniques for the time and location of future attacks and for current threat zones.

In the predictive modeling problem for bombing incidents, we use data from past incidents, together with any derived information about the terrorists and the events, to predict the locations of future attacks. Many approaches in the literature perform this type of predictive analysis. Many of them use simple spatial clustering methods based only on the coordinates, dates, and types of crimes. Such methods include the Spatial and Temporal Analysis of Crime program (STAC) [6]. In [5], Jefferis et al. survey additional hotspot methods that employ kernel density estimation and other, simpler density estimation models.

Liu and Brown [7] extend crime-clustering methods by incorporating offenders’ preferences in crime site selection. A number of researchers have investigated spatial decision-making by criminals [8, 9, 10, 11, 12]. To summarize, this body of research suggests that the likelihood of a criminal incident at a specified location is based on past incidents of the same type and on independent spatial features [7].

4. DATA PREPARATION

Data plays a role throughout the complete process of generating terror forecasts, from data collection through the generation of likelihood functions to the presentation of the forecasts. We divide data preparation into the following categories.

4.1. Data Extraction

Information about all incidents (suicide bombings, planted bombs, drone attacks, and other possible disturbances, including firing and killings) is publicly available. All the information is collected from print media, electronic media, and the internet. Information regarding terrorist events is gathered by different means, e.g. saving clippings and saving internet data.

The possible data attributes used to find the risk value of each city are presented in Table I. The risk value of a city is based on possible suicide bombing risk and disturbance. Public data is used to reveal the terrorists' sense of achievement: when the media tells people about the damage, the fear and terror thus broadcast is something that terrorists take as success. Using publicly available data means that we can plot what terrorists are trying to achieve, and so identify terrorist patterns or behavior.

TABLE I. DATA ATTRIBUTES

No.  Variable
1    INCIDENTDATE
2    YEAR
3    DAY
4    MONTH
5    ISLAMIC_MONTH
6    BLAST_DAY_TYPE
7    HOLIDAY
8    TIME
9    HOURS_BEFORE_LAST_BLAST
10   CITY_COORDINATES_LATITUDE
11   CITY_COORDINATES_LONGITUDE
12   PROVINCE
13   WEATHER
14   LAST_BLAST_TYPE
15   RALLY_TYPE

4.2. Planners Psychology

Terrorists plan to spread terror and use every possible way to create disturbance. Their plans can be broadly divided into two categories: “routine actions” and “reactions”. Routine actions are planned attacks that are executed according to set plans. These plans involve a great deal of work: recruiting the bomber, transporting explosives, making the explosive jacket or vehicle, and placing the bomber near the target. Such planning takes considerable time from start to finish, and the organizations keep executing such plans routinely. Reactions are attacks that terrorist organizations carry out in aggression, as counterattacks. Whenever counter-terrorism agencies attack the heads of terrorist organizations, such reactions are observed.

Routine actions are usually harder to foil than aggressive reactions, because routine actions are planned carefully to avoid leaving any lead. In contrast, aggressive reactions leave room for error because they involve less planning than routine actions. Routine actions leave their own patterns that can be observed, which means that if the pattern is known in advance, routine actions are avoidable.

5. PATTERNS OF SUICIDE ATTACKS OCCURRED IN PAKISTAN

We attempt to discover a pattern in the timing of the suicide attacks that have occurred in Pakistan. This work is along the lines done by Johnson et al for Afghanistan and Iraq [1].

In general, the act of learning causes the time taken for completing a particular task to decrease. For a suicide attack, we let the time interval between successive attack days stand as the time required to perform the attack. We therefore hypothesize that this time interval follows the general rule

$$\tau_n = \tau_1 n^{-b} \qquad (1)$$

where $\tau_n$ is the number of days between the $n$th and the $(n+1)$th suicide attack day, and $\tau_1$ and $b$ are constants for a particular group. Simplifying (taking logarithms of both sides) yields

$$\log(\tau_n) = \log(\tau_1) - b \log(n) \qquad (2)$$

We thus plot log(τn) vs log(n) and fit a best-fit straight line in order to verify the suitability of this model and to estimate τ1 and b. The next figure shows the best-fit plot for the best-fit values of log(τ1) and b for the different regions.

Fig. 3. Best fit plot for different regions

At the 5% significance level, a two-tailed correlation test for sample size 7 shows that there is significant correlation between the variables. We conclude from this that the pattern of learning with time is exhibited by organizations throughout the country.
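The straight-line fit of equation (2) can be sketched as an ordinary least-squares regression of log(τn) on log(n). The snippet below is only an illustration under stated assumptions: the function names are hypothetical, the paper does not say which fitting routine was used, and the seven intervals are synthetic values generated from a known power law, not data from the study.

```python
import math


def fit_learning_curve(intervals):
    """Fit log(tau_n) = log(tau_1) - b*log(n) by ordinary least squares.

    intervals[n-1] holds tau_n, the number of days between the n-th and
    the (n+1)-th attack day. Returns the estimates (tau_1, b).
    """
    if len(intervals) < 2 or any(t <= 0 for t in intervals):
        raise ValueError("need at least two strictly positive intervals")
    xs = [math.log(n) for n in range(1, len(intervals) + 1)]
    ys = [math.log(t) for t in intervals]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the line equals -b
    intercept = y_mean - slope * x_mean    # intercept equals log(tau_1)
    return math.exp(intercept), -slope


def predict_interval(tau_1, b, n):
    """Evaluate equation (1): tau_n = tau_1 * n**(-b)."""
    return tau_1 * n ** (-b)


# Synthetic example (made-up numbers): tau_1 = 120 days, b = 0.5, 7 points.
synthetic = [120.0 * n ** -0.5 for n in range(1, 8)]
tau_1_hat, b_hat = fit_learning_curve(synthetic)
```

On real interval data the points will scatter around the line, so the quality of the fit (for example the correlation test mentioned above) should be checked before using `predict_interval` for extrapolation.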

6. PREDICTION ALGORITHMS

The goal is to predict the threat level, or likelihood of a bombing, in a given region at any date and time, given its location. All the other attributes listed in Table I are calculated automatically from this basic date/time and location coordinate information. Together these form the input feature set (X), comprising the 15 variables of Table I.

The predictive analysis of bombing incidents is carried out using two different techniques, which exhibit approximately similar accuracy but differ in their order of time complexity, as discussed in the next section. The general idea for both techniques is the same: predict the output (Y) given a set of N input features (X). Here, the output (Y = positive) consists of a single class, meaning that the data set contains only examples where a bombing incident has actually occurred.

Thus, we can treat this as a one-class classification (OCC) problem [18], as we do in the DTS technique of Section 6.1, or approach it with a density estimation method, as discussed in Section 6.4.

6.1. Distance based Threat Scoring Technique (DTS)

In this technique, we use a distance-based scoring measure (see Section 6.2) to classify the level of threat as high, medium, or low. The concept is based on the proximity of a new, unseen feature vector to the averages of existing bombing incidents: the more similar it is, the higher the likelihood of another bombing incident. This concept has also been deployed in outlier analysis techniques [19].

To increase accuracy, we identify high-density regions where the bombing incidents are concentrated and assign them to clusters, as explained in Section 6.3. Next, we calculate the centroid of each cluster as:

Now, consider a new point in the feature space. We calculate its distance from the nearest centroid as normalized distance given by:

Thus, we obtain normalized distance scores in the range of 0-100, which are classified as being high (above 70), medium (40-70) and low (below 40) levels of threat.
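The scoring step can be illustrated with a small sketch. The thresholds (above 70, 40-70, below 40) come from the text; everything else is an assumption made for illustration. In particular, the linear mapping from distance to a 0-100 score and the `max_distance` parameter are hypothetical, since the normalization formula is not given here, and plain Euclidean distance stands in for the mixed distance measure.

```python
def euclidean(a, b):
    """Stand-in distance; the paper uses a Gower-style mixed measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def nearest_centroid_distance(point, centroids, distance_fn=euclidean):
    """Distance from a new feature vector to its closest cluster centroid."""
    return min(distance_fn(point, c) for c in centroids)


def threat_score(distance, max_distance):
    """Map a distance onto a 0-100 score where closer means higher.

    ASSUMPTION: a linear inversion clipped to [0, 100]; the exact
    normalization used by the authors is not stated.
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    ratio = min(max(distance / max_distance, 0.0), 1.0)
    return 100.0 * (1.0 - ratio)


def threat_level(score):
    """Apply the thresholds from the text: >70 high, 40-70 medium, <40 low."""
    if score > 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"


# Illustrative values only: two made-up centroids and one new point.
centroids = [(0.0, 0.0), (1.0, 2.0)]
d = nearest_centroid_distance((1.0, 1.0), centroids)
level = threat_level(threat_score(d, max_distance=10.0))
```

Keeping the distance function as a parameter means the same scoring code works unchanged once the mixed distance of Section 6.2 is plugged in.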

6.2. Distance Measure

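Because the data contains numerical, ratio, ordered, and unordered categorical variables, we choose a variation of the Gower distance measure as our proximity score. For numeric data we use the Mahalanobis distance, which is defined as:

$$d(x, y) = \sqrt{(x - y)^{T} S^{-1} (x - y)}$$

where $S$ is the covariance matrix of the numeric attributes.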

For categorical values the following formulas are used.

Unordered

If the values of the attribute are different, then d(x, y) = 1; otherwise d(x, y) = 0.

Ordered

Ordered attributes were first normalized and then the distance was calculated using:

$$d(x, y) = \frac{|x - y|}{\text{Range}}$$

where Range = MaxValue(X) – MinValue(X).
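A minimal sketch of the categorical part of such a mixed distance follows. It assumes the standard Gower conventions: a 0/1 mismatch for unordered attributes, a range-normalized absolute difference for ordered ones, and an unweighted average across attributes. The schema format and function names are hypothetical, and the Mahalanobis component for numeric attributes is left out to keep the example short.

```python
def unordered_distance(x, y):
    """0 if the two category labels match, 1 otherwise."""
    return 0.0 if x == y else 1.0


def ordered_distance(x, y, min_value, max_value):
    """Absolute difference divided by Range = MaxValue - MinValue."""
    value_range = max_value - min_value
    if value_range == 0:
        return 0.0  # attribute is constant, so it cannot separate events
    return abs(x - y) / value_range


def gower_distance(a, b, schema):
    """Average the per-attribute distances between records a and b.

    schema has one entry per attribute: ("unordered",) or
    ("ordered", min_value, max_value).
    ASSUMPTION: all attributes carry equal weight.
    """
    total = 0.0
    for x, y, spec in zip(a, b, schema):
        if spec[0] == "unordered":
            total += unordered_distance(x, y)
        elif spec[0] == "ordered":
            total += ordered_distance(x, y, spec[1], spec[2])
        else:
            raise ValueError("unknown attribute kind: %r" % (spec[0],))
    return total / len(schema)


# Made-up records: (PROVINCE, MONTH, WEATHER)
schema = [("unordered",), ("ordered", 1, 12), ("unordered",)]
dist = gower_distance(("Punjab", 3, "clear"), ("Sindh", 9, "clear"), schema)
```

In this example the provinces differ (contributing 1), the months are 6 apart on a range of 11, and the weather matches (contributing 0), so the result is the mean of those three terms.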

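To calculate the distance between two locations defined by their latitude and longitude coordinates, we used the haversine formula, defined as:

$$\Delta\text{lat} = \text{lat}_2 - \text{lat}_1, \qquad \Delta\text{long} = \text{long}_2 - \text{long}_1$$

$$a = \sin^2(\Delta\text{lat}/2) + \cos(\text{lat}_1)\cos(\text{lat}_2)\sin^2(\Delta\text{long}/2)$$

$$c = 2\,\operatorname{atan2}(\sqrt{a}, \sqrt{1 - a})$$

$$\text{Distance} = R \cdot c$$

where $R$ is the earth’s radius (mean radius = 6,371 km).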
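The haversine computation translates directly into a few lines of code. The sketch below uses only the Python standard library; the function name is hypothetical, inputs are assumed to be in decimal degrees, and the city coordinates in the usage lines are approximate values chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, as stated in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_lat = math.radians(lat2 - lat1)
    d_lon = math.radians(lon2 - lon1)
    a = (math.sin(d_lat / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lon / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


# Approximate, illustrative coordinates for Islamabad and Peshawar.
islamabad = (33.6844, 73.0479)
peshawar = (34.0151, 71.5249)
distance = haversine_km(*islamabad, *peshawar)
```

Because the formula treats the earth as a sphere, the result is an approximation; that is usually adequate when the quantity of interest is proximity between cities.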

6.3. Clustering

The clustering algorithm used is k-means clustering [16], with the distance measure described above. The general algorithm takes K random points in the feature space and measures the distance of all other points from them. The points nearest to each of these initial points form K initial clusters, and their centroids are calculated. The second iteration realigns the clusters, and new centroids are computed. This process continues until the clusters no longer change or a fixed number of iterations is reached.

We use 40 iterations and choose the value of k from 2 to 10. We choose the best K by maximizing the Bayesian Information Criterion (BIC) [17]:

$$\mathrm{BIC}(C \mid X) = L(X \mid C) - \frac{p}{2} \log n$$

where $L(X \mid C)$ is the log-likelihood of the dataset $X$ according to model $C$, $p$ is the number of parameters in the model $C$, and $n$ is the number of points in the dataset.
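The iterative procedure described in this section can be sketched as a plain Lloyd-style loop. This is an illustration under assumptions rather than the authors' implementation: it uses squared Euclidean distance on purely numeric tuples instead of the mixed distance measure, the function names and `seed` parameter are hypothetical, and the BIC-based choice of K is not included.

```python
import random


def squared_euclidean(a, b):
    """Stand-in distance for purely numeric feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, max_iterations=40, distance_fn=squared_euclidean, seed=0):
    """Cluster points into k groups; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct starting points
    assignment = None
    for _ in range(max_iterations):
        # Assign every point to its nearest current centroid.
        new_assignment = [
            min(range(k), key=lambda j: distance_fn(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # clusters stopped changing
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, assignment


# Made-up data with two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = kmeans(points, k=2)
```

To follow the model selection step described above, one would run `kmeans` for each K from 2 to 10 and keep the K with the highest BIC; computing the log-likelihood requires a distributional assumption for the clusters that the text does not specify.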

Figure 4. Visualization of Clustering Results

The above figure visualizes the suicide bombing attacks of the past 5 years, defined by 15 features, using 3 clusters. We used Principal Component Analysis [15] as a dimensionality reduction technique and displayed the result using the best two components.

6.4. Threat Likelihood Prediction using Density Estimation (TLP)

This technique uses KNN-based non-parametric density estimation to predict the likelihood of a bombing incident given the input features. Density estimation methods are well known to suffer from the curse of dimensionality [14]. To avoid this problem, we select from the entire input feature space those features that play a more significant role in predicting the outcome. This feature selection method is described in the following section.

6.5. Feature Selection:

The objective of this method is to select a target feature set p from a much larger initial feature set m. The target features are chosen by a selection procedure that ranks the features according to their relevance to the prediction task at hand. The procedure uses a selection criterion based on the cohesiveness of the points or events defined by a set of features. We search for the features that maximize this cohesion criterion, as done in [7]. The selection criterion used in [7] is as follows.

Let $d_{ij}$ be the distance between two events $i$ and $j$ in the feature subspace defined by the feature subset to be evaluated. We transform the distance $d_{ij}$ into a similarity as follows:

where $\bar{d}$ is the average inter-event distance, and distance refers to differences in the value of an independent variable. [7] defines the Gini index between these two events as:

For a data set of n events, the averaged Gini index below is a suitable measure of cohesiveness:

The smaller the value of the index, the higher the level of point-pattern cohesiveness, and the better the set of features that define the point pattern. In general, $I_g$ can be used in a subset selection algorithm (e.g., forward selection or backward elimination) to yield an optimal or suboptimal subset of features.

6.6. KNN Density Estimation:

Once the desired features are selected, we apply the KNN density estimation technique [13]. Since KNN is non-parametric, it can estimate arbitrary distributions. Instead of using a fixed hypercube and kernel functions, we estimate the density as follows: to estimate the density at a point x, place a hypercube centered at x and keep increasing its size until k neighbors are captured. Then estimate the density using the formula

$$p(x) = \frac{k}{nV}$$

where n is the total number of points and V is the volume of the hypercube. Notice that the numerator is essentially a constant, so the density is determined by the volume. In high-density regions, k points are found very quickly, so the volume of the hypercube is small and the resulting density is high. If instead the density around a point x is very low, the volume of the hypercube needed to encompass the k nearest neighbors is large and, consequently, the ratio is low. Thus, p(x) gives the likelihood of a bombing event.
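The growing-hypercube estimate can be sketched as follows, using the standard k-nearest-neighbor density form p(x) = k / (nV). The function name is hypothetical and the data points are made up. The sketch relies on the observation that the smallest axis-aligned hypercube centered at x that contains a point p has half-side equal to the largest coordinate-wise difference between x and p (the Chebyshev distance).

```python
def knn_density(x, data, k):
    """Estimate p(x) = k / (n * V) with a hypercube grown around x.

    V is the volume of the smallest axis-aligned hypercube centered at x
    that contains k of the points in data.
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(data)")
    # Half-side needed to reach each point (Chebyshev distance to x).
    half_sides = sorted(
        max(abs(a - b) for a, b in zip(x, p)) for p in data
    )
    half_side = half_sides[k - 1]  # reach of the k-th nearest point
    if half_side == 0:
        raise ValueError("k-th neighbor coincides with x; volume is zero")
    volume = (2 * half_side) ** len(x)
    return k / (n * volume)


# Made-up one-dimensional data: three nearby events and one isolated event.
data = [(0.0,), (1.0,), (2.0,), (10.0,)]
dense = knn_density((1.0,), data, k=3)    # inside the crowded region
sparse = knn_density((10.0,), data, k=3)  # at the isolated event
```

With these values the cube around 1.0 needs a side of only 2 to capture three points, while the cube around 10.0 needs a side of 18, so the estimated density is much higher in the crowded region.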

7. PREDICTION TECHNIQUES

In the proposed solution, the prediction of suicide attacks is divided into four categories: high-risk area modeling, prediction of future terrorist attacks, prediction of terrorist organizations through injury patterns, and visualization of high-risk areas through geospatial referencing. These categories are explained in this section.

7.1. High Risk Areas Modeling

Several techniques exist for crime prediction, including Rossmo’s formula, which estimates the point of origin of a serial criminal. Rossmo’s formula divides the map of a crime scene into a grid with i rows and j columns. Then, the probability that the criminal is located in the box at row i and column j is

$$P_{ij} = k \sum_{n=1}^{T} \left[ \frac{\phi_{ij}}{(|X_i - x_n| + |Y_j - y_n|)^{f}} + \frac{(1 - \phi_{ij})\, B^{g-f}}{(2B - |X_i - x_n| - |Y_j - y_n|)^{g}} \right] \qquad (2)$$

where $(X_i, Y_j)$ are the coordinates of the grid box, $(x_n, y_n)$ are the coordinates of the nth crime, f = g = 1.2, k is a scaling constant (so that P is a probability function), T is the total number of crimes, $\phi_{ij}$ puts more weight on one metric than the other, and B is the radius of the buffer zone (suggested to be one-half the mean of the nearest-neighbor distance between crimes) [2]. Rossmo's formula incorporates two important ideas:

1. Criminals won't travel too far to commit their crimes. This is known as distance decay.

2. There is a buffer area around the criminal's residence where the crimes are less likely to be committed.

Rossmo’s formula does not fit this model, because terrorist organizations have different key factors for any terrorist activity.

Sensitive areas are highlighted on a simple model. Terrorist organizations can target defense bases and settlements, foreign diplomats, political and religious rivals, civilian clusters, psychologically sensitive points, and high-value equipment and facilities. Conversely, terrorist organizations are deterred by high security and large distances to the target. Keeping these facts in view, risk can be measured by the following equation:

(3)

where x and y are coordinates. Using the above equation, the predicted suicide attacks in the cities of Pakistan are plotted in Fig. 5.

Fig. 5. Predicted Suicide Attacks in Cities of Pakistan

Actual attacks in the cities of Pakistan are determined from the data collected in the data preparation phase. Fig. 6 illustrates the actual attacks in the cities of Pakistan.

Fig. 6. Actual Suicide Attacks in Cities of Pakistan

A comparison of the two figures (Fig. 5 and Fig. 6) shows that the actual attacks occurred in exactly the cities that were predicted, supporting the reliability and validity of the data and of the system developed.

7.2. Prediction of Future Terrorist Attack

This prediction technique provides an alert and mitigation system that generates a list of specific cities at high risk on specific future dates. The system is based on the past incidents collected in the data preparation phase. The selected algorithm is applied to historical data to generate the list of high-risk cities. The system is developed in C# using data mining techniques such as the Gower distance measure. High-risk areas for 3rd May 2011 are shown in Fig. 7.

Fig. 7. High Risk Areas on May 3rd, 2011

In Fig. 7, cities with different risk levels are shown. Red indicates a high probability of attack in a city, yellow indicates medium risk, and green indicates no risk on the given date.

7.3. Prediction of Terrorist Organizations through Injury Patterns

On the basis of medical reports collected in the data preparation phase, different injury patterns are identified that indicate different terrorist organizations. These injury patterns are identified using data mining techniques. Fig. 8 shows the injury patterns of two terrorist organizations, BLA (Balochistan Liberation Army) and LEJ (Lashkar e Jhangvi).

Fig. 8. Injury patterns of BLA and LEJ

As shown in Fig. 8, the injury patterns of BLA and LEJ are different. In BLA attacks, 38% of injuries are to the abdomen. In LEJ attacks, the head suffers more injuries, approximately 16% of total injuries.

7.4. Visualization of High Risk Areas through Geo Referencing

Graphical presentation on a map helps to analyze the situation visually. In Fig. 9, Fig. 10, and Fig. 11, attack patterns can be clearly seen.

Fig. 9. Risk Values plotted on map

Terrorist organizations follow a distinct pattern in carrying out suicide attacks, as shown in the above figures. As indicated in red, the attack pattern starts from Upper Dir and reaches Islamabad, covering the cities between them.

8. EVALUATION

Based on incidents that have occurred in Pakistan since 1995, the following results were generated for a week of April 2011. Cities with repeated patterns suffered terrorist attacks in April 2011. Mathematical models and algorithms are used to get the closest results. For instance, Dara Adam Khel appears repeatedly in Table II and had a blast on 1st April 2011. Table II clearly shows Dara Adam Khel as a high-risk area for the whole week, whereas the actual attack occurred only on the first day of the week.

The next Fig shows number of attacks, cities of attack and damage that was caused due to suicide attacks from Jan 2011 to Jul 2011.

Fig. 12. Statistics and Damage of Suicide attacks from Jan 2011 to Jul 2011

TABLE II. RESULT OF WEEK OF APRIL

Date (April 2011)    Cities
1    Hangu, Charsadda, Dara Adam Khel, Kohat
2    Hangu, Peshawar, Noshehra, Charsadda, Mardan, Dara Adam Khel, Malakand, Swabi, Kohat, Lakki Marwat
3    Peshawar, Noshehra, Mardan, Dara Adam Khel, Malakand, Swabi, Kohat, Lakki Marwat
4    Hangu, Bannu, Peshawar, Noshehra, Charsadda, Mardan, Dara Adam Khel, Malakand, Swabi, Kohat
…    …
7    Hangu, Peshawar, Dara Adam Khel, Kohat

According to the statistics, 19 attacks occurred from Jan 2011 to Jul 2011, of which 15 were predicted and 4 were missed by the presented technique. The overall accuracy of the presented technique for forecasting suicide bombing attacks is therefore 15/19, or approximately 78.95%.

9. CONCLUSION AND FUTURE WORK

Technology combined with human intelligence turns out to be the most powerful weapon of present times. To make the best use of technology and reap the maximum benefit from it, all we have to do is “trust it” and “use it” in the right way. To make the system more effective, we plan to predict incidents happening in a 1 km square radius. We are also gathering more variables describing the locations, such as proximity to important places, properties of the locations and buildings hit by previous bombings, and political events that occurred as a prelude to the attack.

References

[1] Pape, R. A., “Dying to Win: The Strategic Logic of Suicide Terrorism”, Random House, 2005

[2] www.Pakistanbodycount.org

[3] http://www.pips.org.pk/

[4] http://www.satp.org/

[5] Jefferis, E. (1998). “A multi-method exploration of crime hot spots”. Presentation at the Annual Meeting of the Academy of Criminal Justice Sciences, Albuquerque, NM, March 10–14, 1998

[6] Block, C. (1995). “STAC hot-spot areas: A statistical tool for law enforcement decisions”. In Block, C. R., Dabdoub, M., & Fregly, S. (Eds.), Crime analysis through computer mapping. Washington, DC: Police Executive Research Forum

[7] Liu, H., & Brown, D. E. (2003). “Criminal incident prediction using a point-pattern-based density model”. International Journal of Forecasting, 19, 603–622

[8] Amir, M. (1971). “Patterns in forcible rape”. Chicago: University of Chicago Press

[9] Baldwin, J., & Bottoms, A. (1976). “The urban criminal: A study in Sheffield”. London: Tavistock Publications

[10] Brantingham, P., & Brantingham, P. (1984). “Patterns in crime”. New York: Macmillan Publishing

[11] LeBeau, J. L. (1987). “The journey to rape: Geographic distance and the rapist’s methods of approaching the victim”. Journal of Police Science and Administration, 15, 129–136

[12] Levine, N. (1998). ‘‘Hot Spot’ analysis using CrimeStat kernel density interpolation”. Presentation at the Annual Meeting of the Academy of Criminal Justice Sciences, Albuquerque, NM, March 10–14, 1998.

[13] K. Fukunaga and L.D. Hostetler. “Optimization of k-nearest neighbor density estimates”. IEEE Transactions on Information Theory, 19:320–326, 1973.

[14] Bellman, R. E. (2003). “Dynamic Programming”. Courier Dover Publications. ISBN 978-0-486-42809-3.

[15] Abdi, H., & Williams, L. J. (2010). "Principal component analysis". Wiley Interdisciplinary Reviews: Computational Statistics, 2: 433–459.

[16] Lloyd, S. P. (1982). "Least squares quantization in PCM". IEEE Transactions on Information Theory

[17] G. Schwarz, “Estimating the dimension of a model”, The Annals of Statistics, vol. 6, pp 461-464, 1978.

[18] Tax, D. “One class classification”. PhD Thesis, Delft University of Technology (2001)

[19] K.S. Killourhy and R.A. Maxion, "Comparing Anomaly-Detection Algorithms for Keystroke Dynamics." Proc. Int. Conf. Dependable Systems & Networks (DSN-09)
