
AN EFFICIENT BIO-INSPIRED ALGORITHM BASED DATA CLASSIFICATION MODEL FOR INTRUSION DETECTION IN MOBILE ADHOC NETWORKS

S. Murugan¹, Dr. M. Jeyakarthic²

¹Assistant Professor, Thiru Vi Ka Govt Arts College, Thiruvarur.

²Assistant Director (Academic), Tamil Virtual Academy, Anna University Campus, Chennai-600025.


Abstract

Mobile ad hoc networks (MANETs) are now applied in several domains owing to the self-configuring nature of their independent mobile nodes. The wireless links used for communication in a MANET pose a major security challenge, so it becomes essential to introduce an intrusion detection system (IDS) for MANET. In this paper, a Fitness-Scaling Chaotic Genetic Ant Colony Algorithm (FSCGACA) based classifier is presented to detect intrusions in MANET. The proposed technique is validated on the benchmark KDD Cup 99 dataset. Compared with existing methods, the FSCGACA model offered superior results, with a sensitivity of 97.02, specificity of 99.39, accuracy of 98.09, F-score of 98.24 and kappa value of 96.16.

Keywords: Classification; Intrusion Detection; MANET; Genetic Algorithm

The International journal of analytical and experimental modal analysis
Volume XI, Issue XI, November/2019
ISSN NO: 0886-9367
Page No:834

1. Introduction

A mobile ad hoc network (MANET) consists of a number of mobile nodes equipped with wireless transceivers that communicate with one another either directly or through intermediate nodes. Wireless networks have recently become more common in industrial applications for accessing and managing devices from remote locations [1]. An important advantage of a MANET is that it allows data communication among several nodes while they remain mobile. Compared with conventional wireless networks, a MANET maintains a decentralized network infrastructure, and the nodes can move freely in a random manner. It can form an independent network without the need for any centralized infrastructure. The minimal configuration and quick deployment of MANET nodes make them usable in mission-critical applications, such as disaster relief or military operations, where fixed infrastructure is infeasible. Owing to these distinctive features, MANETs are becoming more popular and are used in varied applications. At the same time, because MANETs are well suited to such critical applications, network security is essential. For instance, because the nodes lack physical security, attackers may easily compromise them. In particular, since MANET routing protocols assume that every node in the network cooperates in a distributed manner and is non-malicious, attackers can compromise a MANET by inserting malicious nodes into the network. Moreover, because of the distributed architecture and changing network topology intrinsic to MANETs, a classical centralized monitoring approach is ineffective. Under these conditions, there is a strong need for an Intrusion Detection System (IDS) designed specifically for MANET.

Intrusion detection in MANET is more challenging than in wired and static networks, because it is difficult to fulfil the requirements of an IDS (the ability to collect audit data, to apply detection techniques that recognize intrusions with a low false rate, and to take remedial action against the recognized intrusions), and because the intrinsic characteristics of MANET create operational and implementation difficulties. Several other difficulties in designing an IDS for MANET are as follows. A MANET has no centralized point at which monitoring and audit data collection can take place. MANET routing protocols require the nodes to cooperate and act as routers, which creates further opportunities for attack. The dynamic network topology makes intrusion detection more complex. Because MANET nodes are highly constrained in processing capability, it is hard to design an IDS for them. On the other hand, intrusion detection can serve as a mechanism to flag possible security breaches in the network. A natural way to perform intrusion detection is to use a classification model that decides whether a set of observed traffic data is "normal" or "abnormal". Naturally, the classifier aims to minimize the probability of error. Although rule-based and anomaly detection approaches also exist, this study focuses on classification approaches. In particular, this paper examines how classification techniques can benefit IDS in MANET.

A significant approach to designing IDS for modern intrusions is self-learning, a valuable and proactive approach that employs machine learning models. It uses unsupervised and supervised techniques to recognize and categorize known and unknown security intrusions, which helps in initiating the desired action against malicious activity at the right time. The design of efficient machine learning (ML) techniques for real-time IDS is still in its infancy. Although numerous solutions have been proposed, their usability on real-time platforms is not fully established. Many of the available techniques result in high false-alarm rates and computational complexity. Importantly, ML has advanced to a further level known as deep learning (DL), which can be described as a combined form of ML techniques. DL is able to learn abstract representations of complex hierarchical features from sequence data such as TCP/IP packets. DL has recently become more successful owing to the notable results obtained in areas such as natural language processing and speech processing. The primary reasons are that DL offers two major characteristics: hierarchical feature representations and the learning of long-term dependencies of temporal patterns in large-scale sequential data. A taxonomy of existing works on shallow as well as DL techniques is provided.

Intrusion detection is one of the mature fields of network security. Although numerous approaches exist, such as rule-based systems and anomaly detection systems, our work focuses on classification techniques. Classification has been widely used to detect intrusions in wired systems, but only a limited number of attempts have been made at classification-based intrusion detection in wireless ad hoc networks. An IDS for ad hoc networks is presented in [2]. The authors proposed cooperative and distributed anomaly-based intrusion detection for wireless ad hoc networks, which provides a practical guide to IDS design. Based on routing updates, they target anomaly detection at the Media Access Control (MAC) layer and the mobile application layer. To address the resource limitations that every MANET faces, [3] extended the earlier work by proposing a cluster-based IDS. To classify "abnormal" and "normal" behaviour, a set of statistical features extracted from the routing table is used, and decision tree induction is applied for classification. If the identified attack occurs within one hop, the proposed technique can identify the source of the attack. Two types of distributed intrusion detection architectures, namely hierarchical and completely distributed, were proposed in [4]. The intrusion detection technique used in these architectures targets the network layer and is based on the Support Vector Machine (SVM). It uses a set of parameters extracted from the network layer, and the authors suggested that the hierarchical distributed approach may be a promising solution when compared with the completely distributed one. Liu et al. [5] proposed a completely distributed anomaly detection technique that uses MAC-layer data to profile the behaviour of mobile nodes and then applies cross-feature analysis to the feature vectors constructed from the training data.

A cooperative and distributed IDS is presented in [6], which uses the MAC, routing and application layers together with a Bayesian classification model. [7] used an ensemble of classifiers built by training several C4.5 classifiers and evaluated it on a MANET against two attack types. To accomplish intrusion detection in MANET, the Conformal Predictor k-nearest neighbor and Distance based Outlier Detection (CPDOD) method is used to detect sinkhole attacks [8]. Cost-sensitive classification was accomplished with a k-nearest neighbor (KNN) classifier to estimate probabilities. Numerous reported studies used the KDD Cup 99 dataset for intrusion detection problems.

The nature of the wireless links used for communication in MANET poses a major security challenge, so it becomes essential to introduce an intrusion detection system (IDS) for MANET. In this paper, a fitness-scaling chaotic genetic ant colony algorithm (FSCGACA) based classifier is presented to detect intrusions in MANET. The proposed technique is validated on the benchmark KDD Cup 99 dataset. Compared with existing methods, the FSCGACA model offered superior results on several measures.

2. FSCGACA based classification model for IDS

The presented model operates in two stages. First, preprocessing is performed: the data are normalized and non-numerical features are transformed into numerical ones. Then, the FSCGACA technique is executed to classify the data.

The process involved in GACA is shown in Fig. 1. Consider a classification problem with $C$ classes in an $n$-dimensional feature space, with $p$ vectors $X_i = [x_{i1}, x_{i2}, \dots, x_{in}]$ ($i = 1, 2, \dots, p$). The if-then rules take the form of Eq. (1).

$$\text{IF } l_1 \le x_{i1} \le u_1 \text{ AND } l_2 \le x_{i2} \le u_2 \text{ AND } \dots \text{ AND } l_n \le x_{in} \le u_n \text{ THEN } X_i \text{ belongs to class } c \qquad (1)$$

where $n$ is the number of elements, and $l_z$ and $u_z$ are the lowest and highest bounds of the $z$th element, respectively. A sample rule applied to the database is given in Eq. (2).

Fig. 1. Process involved in GACA
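As an illustration of how interval-based if-then rules can classify a record, the following sketch checks each element against a lower and an upper bound. The rule list, the bounds and the fallback label "unknown" are hypothetical and are not taken from the paper.

```python
def rule_matches(x, lower, upper):
    """True when every element of x lies within its [lower, upper] interval."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))


def classify(x, rules, default="unknown"):
    """Return the label of the first matching rule, or the default label."""
    for lower, upper, label in rules:
        if rule_matches(x, lower, upper):
            return label
    return default


# Hypothetical rules over two normalized attributes: (lower bounds, upper bounds, class).
rules = [
    ([0.0, 0.0], [0.3, 1.0], "normal"),
    ([0.6, 0.5], [1.0, 1.0], "anomaly"),
]

label_a = classify([0.1, 0.9], rules)  # falls inside the first rule
label_b = classify([0.8, 0.7], rules)  # falls inside the second rule
label_c = classify([0.5, 0.2], rules)  # matches no rule
```

In this reading, the optimizer's job is to search for the bound vectors that give the best classification fitness.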

The GACA technique, a combination of GA and ACA, is improved in two respects. First, fitness scaling converts the raw fitness values returned by the fitness function into values in a range suitable for the selection function [9], which uses the scaled fitness values to choose the individuals of the next generation. The selection function assigns a higher probability of selection to individuals with higher scaled values [10]. Four widely used fitness scaling techniques are provided in Table. Among these fitness scaling methods, power scaling finds a solution quickly owing to the improvement in diversity, but it lacks stability. In contrast, rank scaling shows stability on several types of tests [11]. Therefore, a new power-rank scaling technique is proposed that combines the power and rank methods as follows:

$$\mathrm{sfit}_x = \frac{(r_x)^{z}}{\sum_{y=1}^{N} (r_y)^{z}} \qquad (3)$$

where $r_x$ is the rank of the $x$th individual ant and $N$ is the population size. The process consists of 3 sub-processes:

At first, all ants are sorted to obtain their respective ranks.

Then, the ranks are raised to the power of the exponent z.

Next, the scaled values are normalized by dividing them by the sum of the scaled values over the whole population.
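The three sub-processes above can be sketched as follows. The rank convention (worst individual gets rank 1, best gets rank N, ties broken by position) and the default exponent are assumptions, since the paper does not state them.

```python
def power_rank_scale(raw_fitness, power=2.0):
    """Power-rank fitness scaling: rank, raise to a power, normalize to sum 1."""
    n = len(raw_fitness)
    # Sub-process 1: sort the individuals to obtain ranks (worst = 1, best = n).
    order = sorted(range(n), key=lambda i: raw_fitness[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    # Sub-process 2: raise each rank to the chosen exponent.
    powered = [r ** power for r in ranks]
    # Sub-process 3: normalize by the sum over the whole population.
    total = sum(powered)
    return [p / total for p in powered]


# Hypothetical raw fitness values for three individuals.
scaled = power_rank_scale([0.91, 0.42, 0.77], power=2.0)
```

Because only the ranks enter the formula, the scaled values are insensitive to the absolute spread of the raw fitness, which is the stability property attributed to rank scaling.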

Chaos theory is epitomized by the so-called butterfly effect described by Lorenz [12]. Studying weather systems, Lorenz found that small differences in the initial conditions led subsequent simulations to drastically different final states, which means that long-term prediction is impossible in general [13]. Sensitive dependence on initial conditions is observed not only in complex systems but also in the simple logistic equation. The well-known logistic equation is defined as

$$x_{n+1} = \mu\, x_n (1 - x_n) \qquad (4)$$

where $\mu = 4$ and $x_0 \notin \{0, 0.25, 0.5, 0.75, 1\}$. The chaotic sequence is used to drive the mutation process. In summary, the detailed steps of FSCGACA are explained below.
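A minimal sketch of a logistic-map sequence and of one way to turn a chaotic number into a mutated gene value is given below. The linear mapping onto the bounds is an assumption; the paper only says that the chaotic number is matched to the user-specified bounds.

```python
def logistic_sequence(x0, length, mu=4.0):
    """Iterate x_{n+1} = mu * x_n * (1 - x_n) and return the generated values."""
    # These starting points collapse onto fixed points of the map for mu = 4.
    if not 0.0 < x0 < 1.0 or x0 in (0.25, 0.5, 0.75):
        raise ValueError("x0 must lie in (0, 1) and avoid 0.25, 0.5 and 0.75")
    seq, x = [], x0
    for _ in range(length):
        x = mu * x * (1.0 - x)
        seq.append(x)
    return seq


def chaotic_mutation(chaos_value, lower, upper):
    """Map a chaotic number in [0, 1] linearly onto the interval [lower, upper]."""
    return lower + chaos_value * (upper - lower)


seq = logistic_sequence(0.3, 5)
# Hypothetical gene bounds of 10 and 20.
mutated = [chaotic_mutation(v, 10.0, 20.0) for v in seq]
```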

Step 1 (parameter settings): Choose the population size, crossover probability, mutation probability, elite selection probability, trail level factor, attractiveness factor, pheromone evaporation coefficient, pheromone constant, power scale factor and initial logistic point, and set the iteration epoch z = 0.

Step 2 (initialization): Generate the candidate solutions, indexed 1, 2, ... up to the population size, randomly. Their fitness values are evaluated and scaled with Eq. (3).

Step 3 (elitist selection): Select the best individuals, in the number set by the elite selection probability, to replace an equal number of the worst individuals.

Step 4 (crossover): Choose individuals in the number set by the crossover probability and apply one-point crossover: a locus is chosen at random, all data beyond that locus are exchanged between the two parents, and the resulting individuals are the children.

Step 5 (mutation): Choose individuals in the number set by the mutation probability and apply uniform mutation. The value of each chosen individual is replaced by a chaotic number generated with Eq. (4), mapped to the user-specified upper and lower bounds.

Step 6 (evaluation): The fitness values of the new individuals are evaluated and scaled with Eq. (3).

Step 7 (update): If the scaled fitness of a new individual exceeds that of the old one, the new individual replaces the old one, and its fitness values are updated as well.

Step 8:

Step 9: If 𝐸 , jump to Step 3; otherwise go to the next step.

Step 10 (transform): Create a 10×10×⋅⋅⋅×10 graph according to the required precision of the problem. The individuals are converted into ants. The route corresponding to the values of each individual is laid with pheromone 𝜏 in an amount given by its fitness value. The heuristic function value 𝜂 is set equal to the scaled fitness value.

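Step 11 (path selection): Each ant selects its new path with probability

$$p_{j} = \frac{\tau_{j}^{\alpha}\,\eta_{j}^{\beta}}{\sum_{l} \tau_{l}^{\alpha}\,\eta_{l}^{\beta}} \qquad (5)$$

where $\tau_j$ and $\eta_j$ are the pheromone and heuristic values of candidate path $j$, $\alpha$ is the trail level factor, $\beta$ is the attractiveness factor, and the sum runs over all candidate paths $l$.

Step 12 (pheromone update): The trail is updated with the following formula

$$\tau_{j} \leftarrow (1-\rho)\,\tau_{j} + \Delta\tau_{j}, \qquad \Delta\tau_{j} = \begin{cases} Q \cdot \mathrm{sfit}[x] & \text{if ant } x \text{ uses path } j \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

where $\rho$ is the pheromone evaporation coefficient, $Q$ is the pheromone constant and $\mathrm{sfit}[x]$ is the scaled fitness of ant $x$.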

Step 13: z = z + 1.

Step 14: If the stopping criterion is satisfied, go to the next step.

Step 15: Select and output the optimal route.

With the above steps, the presented approach classifies the input data to recognize whether the network traffic contains an intrusion.
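Three of the operators named in the steps above can be sketched in a few lines: one-point crossover (Step 4), probabilistic path selection (Step 11) and pheromone update (Step 12). This is a generic ant-colony formulation under stated assumptions, not the authors' code: the default values of `alpha`, `beta` and `rho`, the single-child crossover and the evaporate-then-deposit update are all illustrative choices.

```python
import random


def one_point_crossover(parent_a, parent_b, locus):
    """Child takes the genes of parent_a before the locus and of parent_b after it."""
    return parent_a[:locus] + parent_b[locus:]


def selection_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """Probability of each candidate path, proportional to tau^alpha * eta^beta."""
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(weights)
    return [w / total for w in weights]


def select_path(tau, eta, alpha=1.0, beta=2.0, rng=random):
    """Draw one path index at random according to the selection probabilities."""
    probs = selection_probabilities(tau, eta, alpha, beta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def update_pheromone(tau, used_paths, deposit, rho=0.1):
    """Evaporate every trail by the factor (1 - rho), then add the deposit on used paths."""
    return [
        (1.0 - rho) * t + (deposit if j in used_paths else 0.0)
        for j, t in enumerate(tau)
    ]


child = one_point_crossover([1, 2, 3, 4], [5, 6, 7, 8], locus=2)
probs = selection_probabilities([1.0, 1.0], [1.0, 3.0])
chosen = select_path([1.0, 1.0], [1.0, 3.0], rng=random.Random(0))
new_tau = update_pheromone([1.0, 1.0], used_paths={1}, deposit=0.5)
```

With equal pheromone on both paths, the path whose heuristic value is three times larger is selected nine times as often when beta = 2.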




3. Performance Validation

3.1. Dataset used

The performance of the presented IDS model is evaluated on the KDD Cup 1999 dataset [14]. The dataset holds a total of 125973 samples under two class labels, normal and anomaly. The attributes present in the dataset are shown in Table 1, and the dataset details are given in Table 2.

Table 1 Attributes Description




Table 2 Dataset Description

Dataset   Source          # of instances   # of attributes   # of classes   Normal/Anomaly
IDS       KDD Cup 1999    125973           41                2              67343/58630

Of the 125973 instances, 67343 fall under the normal class and the remaining 58630 fall under the anomaly class. Furthermore, the dataset contains a total of 41 attributes, whose values are of various kinds, such as numeric, Boolean and symbolic data.

3.2. Results and discussion

Table 3 gives the confusion matrices obtained by FSCGACA and several existing classifiers on the IDS dataset. The classifier performance can be calculated from the values in the confusion matrix. Table 3 shows that FSCGACA correctly classifies 67000 samples as normal and 56570 samples as anomaly, the highest total (123570) among the compared classifiers. In contrast, the existing RBF correctly recognizes 62876 and 54187 instances as normal and anomaly respectively, and LR correctly recognizes 65540 and 56789 instances as normal and anomaly respectively. From these values, it can be seen that FSCGACA gives the best classification, with the highest number of correctly predicted instances compared with the existing methods.
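As a worked check, the accuracy follows directly from the correctly classified counts quoted above and the dataset size of 125973. The decimal values are rounded to four places.

```latex
\[
\text{Accuracy}_{\text{FSCGACA}} = \frac{67000 + 56570}{125973} = \frac{123570}{125973} \approx 0.9809,
\qquad
\text{Accuracy}_{\text{RBF}} = \frac{62876 + 54187}{125973} = \frac{117063}{125973} \approx 0.9293 .
\]
```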

Fig. 2. FPR and FNR analysis of distinct classifier models on IDS dataset

Table 3 Confusion Matrix of Intrusion Detection Dataset using Various Classifiers

Classifier            Experts    Normal    Anomaly
FSCGACA               Normal     67000     343
                      Anomaly    2060      56570
RBFNetwork            Normal     62876     4467
                      Anomaly    4443      54187
Logistic Regression   Normal     65540     1803
                      Anomaly    1841      56789
Random Forest         Normal     63836     3507
                      Anomaly    5257      53373
Random Tree           Normal     64655     2688
                      Anomaly    2917      55713
Decision Tree         Normal     64662     2701
                      Anomaly    2919      55711

Table 4 Performance Evaluation of Various Classifiers on Intrusion Detection System Dataset

Classifier      FPR    FNR    Sensitivity   Specificity   Accuracy   F-score   Kappa
FSCGACA         0.60   2.98   97.02         99.39         98.09      98.24     96.16
RBFNetwork      7.61   6.59   93.40         92.38         92.93      93.38     85.79
LR              3.07   2.73   97.26         96.92         97.10      97.29     94.19
Random Forest   6.16   7.61   92.39         93.83         93.04      93.58     85.99
Random Tree     4.60   4.31   95.68         95.39         95.55      95.84     91.06
Decision Tree   4.62   4.31   95.68         95.37         95.53      95.83     91.03
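The measures in Table 4 can be recomputed from the confusion matrices in Table 3. The paper does not state the formulas; the sketch below uses the denominators that appear to reproduce the FSCGACA row to within rounding, namely the Normal and Anomaly column totals of Table 3, with Normal as the positive class. That reading is an assumption, and the argument names are illustrative.

```python
def ids_metrics(nn, na, an, aa):
    """Measures (in %) from a 2x2 matrix laid out as in Table 3.

    nn, na: Experts-Normal row, Normal and Anomaly columns.
    an, aa: Experts-Anomaly row, Normal and Anomaly columns.
    """
    total = nn + na + an + aa
    sensitivity = nn / (nn + an)   # share of the Normal column that is correct
    specificity = aa / (aa + na)   # share of the Anomaly column that is correct
    accuracy = (nn + aa) / total
    f_score = 2 * nn / (2 * nn + na + an)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((nn + na) * (nn + an) + (an + aa) * (na + aa)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    values = {
        "FPR": 1 - specificity,
        "FNR": 1 - sensitivity,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
        "kappa": kappa,
    }
    return {name: 100 * value for name, value in values.items()}


# FSCGACA entries from Table 3.
fscgaca = ids_metrics(67000, 343, 2060, 56570)
```

The last digit can differ from Table 4 by one unit, depending on whether values are rounded or truncated.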




To assess the classification results, the presented FSCGACA was compared with several classifiers, namely DT, RBF, RF, LR and RT, on the IDS dataset. Table 4 gives the classification performance in terms of several validation measures, and Fig. 2 shows the results of the classifiers in terms of FPR and FNR. Among the existing classifiers, RBF and RF performed worst: RBF attains the highest FPR (7.61) and RF the highest FNR (7.61). The RT and DT classifiers performed almost identically to one another. They achieved better results than RBF and RF, but worse results than LR and FSCGACA. LR proved efficient, with low FPR and FNR values of 3.07 and 2.73, but its overall results are inferior to those of the presented FSCGACA classifier. FSCGACA achieved the lowest FPR, 0.60, with an FNR of 2.98, which is slightly above the 2.73 of LR.

Fig. 3 shows the comparative analysis of the techniques under several measures: sensitivity, specificity, accuracy, F-score and kappa value. With respect to sensitivity and specificity, the RBF classifier showed poor results, with values of 93.40 and 92.38 respectively; the latter is the lowest specificity, while RF has the lowest sensitivity, 92.39. The DT and RT classifiers are competitive, with sensitivity and specificity values of approximately 95.68 and 95.39 respectively. LR outperforms the other existing techniques, with sensitivity and specificity values of 97.26 and 96.92 respectively. The presented FSCGACA classifier achieved the highest specificity, 99.39, with a sensitivity of 97.02, slightly below that of LR. With respect to accuracy, FSCGACA attains the highest value of 98.09, while RBF, LR, RF, RT and DT achieve accuracies of 92.93, 97.10, 93.04, 95.55 and 95.53 respectively. Among the existing techniques, LR gives the best F-score, 97.29. Still, the presented FSCGACA classifier is more efficient, with the highest F-score of 98.24. Finally, FSCGACA shows the best agreement of all techniques, with the maximum kappa value of 96.16.




Fig. 3. Classification performance analysis on IDS dataset

4. Conclusion

The distributed architecture and changing network topology intrinsic to MANET make a classical centralized monitoring approach ineffective. Because the nodes lack physical security, attackers may easily compromise them. The nature of the wireless links used for communication in MANET poses a major security challenge, so it becomes essential to introduce an intrusion detection system (IDS) for MANET. In this paper, a fitness-scaling chaotic genetic ant colony algorithm (FSCGACA) based classifier is presented to detect intrusions in MANET. The proposed technique is validated on the benchmark KDD Cup 99 dataset. Compared with existing methods, the FSCGACA model offered superior results on several measures.

References

[1] Y. Kim, "Remote sensing and control of an irrigation system using a distributed wireless sensor network," IEEE Trans. Instrum. Meas., vol. 57, no. 7, pp. 1379–1387, Jul. 2008.

[2] Y. Zhang, W. Lee, and Y. Huang, "Intrusion detection techniques for mobile wireless networks," Wireless Networks, vol. 9, no. 5, pp. 545–556, 2003.


[3] Y. Huang and W. Lee, "A cooperative intrusion detection system for ad hoc networks," in Proceedings of the 1st ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN'03), Fairfax, VA, USA, 2003, pp. 135–147.

[4] H. Deng, Q. Zeng, and D. P. Agrawal, "SVM-based intrusion detection system for wireless ad hoc networks," in Proceedings of the 58th IEEE Vehicular Technology Conference (VTC'03), vol. 3, Orlando, FL, USA, 6–9 October 2003, pp. 2147–2151.

[5] Y. Liu, Y. Li, and H. Man, "MAC layer anomaly detection in ad hoc networks," in Proceedings of the 6th Annual IEEE SMC Information Assurance Workshop (IAW'05), West Point, NY, USA, 15–17 June 2005, pp. 402–409.

[6] S. Bose, S. Bharathimurugan, and A. Kannan, "Multi-layer integrated anomaly intrusion detection system for mobile ad hoc networks," in Proceedings of the IEEE International Conference on Signal Processing, Communications and Networking (ICSCN 2007), February 2007, pp. 360–365.

[7] J. B. D. Cabrera, C. Gutiérrez, and R. K. Mehra, "Ensemble methods for anomaly detection and distributed intrusion detection in mobile ad hoc networks," Information Fusion, vol. 9, pp. 96–119, 2008.

[8] W. Shim, G. Kim, and S. Kim, "A distributed sinkhole detection method using cluster analysis," Expert Systems with Applications, vol. 37, no. 12, pp. 8486–8491, 2010.

[9] J. Yang, J. Deng, S. Li, and Y. Hao, "Improved traffic detection with support vector machine based on restricted Boltzmann machine," Soft Computing, vol. 21, no. 11, pp. 3101–3112, 2017.

[10] Z. Wang, "The applications of deep learning on traffic identification," BlackHat USA, 2015.

[11] R. C. Staudemeyer and C. W. Omlin, "Extracting salient features for network intrusion detection using machine learning methods," South African Computer Journal, vol. 52, no. 1, pp. 82–96, 2014.

[12] H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in Fifteenth Annual Conference of the International Speech Communication Association, 2014.

[13] S. Mazraeh, M. Ghanavati, and S. H. N. Neysi, "Intrusion detection system with decision tree and combine method algorithm," Int. Acad. J. Sci. Eng., vol. 3, pp. 21–31, 2016.

[14] KDD Cup 1999. Available at: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html


