AN EFFICIENT BIO-INSPIRED ALGORITHM BASED DATA
CLASSIFICATION MODEL FOR INTRUSION DETECTION IN
MOBILE ADHOC NETWORKS
S. Murugan (1), Dr. M. Jeyakarthic (2)
(1) Assistant Professor, Thiru Vi Ka Govt Arts College, Thiruvarur.
(2) Assistant Director (Academic), Tamil Virtual Academy, Anna University Campus, Chennai-600025.
Abstract
Mobile ad hoc networks (MANETs) are applied in several domains because their
independent mobile nodes are self-configuring. The wireless links used for
communication in a MANET pose a major security challenge, so an intrusion detection
system (IDS) designed for MANETs is essential. This paper presents a classifier based on
the Fitness-Scaling Chaotic Genetic Ant Colony Algorithm (FSCGACA) to detect intrusions
in MANETs. The proposed technique is validated on the benchmark KDD Cup 99 dataset.
Compared with existing methods, the FSCGACA model gave the best overall results, with a
sensitivity of 97.02, specificity of 99.39, accuracy of 98.09, F-score of 98.24 and
kappa value of 96.16.
Keywords: Classification; Intrusion Detection; MANET; Genetic Algorithm
1. Introduction
A Mobile Ad hoc NETwork (MANET) consists of a number of mobile nodes that use
wireless transceivers to communicate with one another either directly or through intermediate
nodes. Recently, wireless networks have become more common in industrial applications for
accessing and managing devices from remote areas [1]. An important advantage of a MANET is
that it allows data communication between nodes even while they are mobile. Compared with
conventional wireless networks, a MANET has a decentralized network infrastructure, and its
nodes can move freely in a random manner. It can form an independent network without any
centralized infrastructure. The minimal configuration and quick deployment of nodes in a
MANET make it usable in situations that
The International journal of analytical and experimental modal analysis
Volume XI, Issue XI, November/2019
ISSN NO: 0886-9367
Page No:834
would otherwise be infeasible, including mission-critical applications such as disaster
relief or military operations. Because of these distinctive features, MANETs have become
increasingly popular and are used in varied applications. At the same time, because MANETs
are well suited to many critical applications, network security is all the more essential.
For instance, because the nodes lack physical security, attackers can easily compromise
them. In particular, since MANET routing protocols assume that every node in the network
cooperates in a distributed way and that no node is malicious, attackers can compromise a
MANET by inserting malicious nodes into the network. Moreover, because of the MANET's
inherently distributed architecture and changing network topology, a classical centralized
monitoring approach is ineffective. Under these conditions, there is a strong need for an
Intrusion Detection System (IDS) designed specifically for MANETs.
Intrusion detection in MANETs is more challenging than in wired and static networks, both
because it is difficult to meet the requirements of an IDS (the ability to collect audit
data, to apply detection techniques that recognize intrusions with a low false-alarm rate,
and to take remedial action for recognized intrusions) and because the intrinsic
characteristics of MANETs create operational and implementation difficulties. Several
further difficulties in designing an IDS for MANETs are listed here. A MANET has no
centralized point at which monitoring and audit data collection can take place. MANET
routing protocols require the nodes to cooperate and act as routers, which creates further
opportunities for attacks. The dynamic network topology makes intrusion detection more
complex. Because MANET nodes are highly constrained in processing capability, it is hard to
design an IDS for them. On the other hand, intrusion detection (ID) can be used as a
mechanism to signal possible security breaches in the network. A natural way to carry out
ID is to use a classification model to decide whether a given amount of observed traffic
data is "normal" or "abnormal". The classification method aims to minimize the probability
of error. Although rule-based and anomaly-detection approaches also exist, this study
focuses mainly on classification approaches. In particular, this paper examines how
classification techniques can be of value to IDS in MANETs.
A significant way to equip an IDS against modern intrusions is self-learning, a valuable
and proactive approach that employs machine learning models. It uses unsupervised and
supervised techniques to recognize and categorize known and unknown security intrusions,
which helps to initiate appropriate action against malicious activity in time. The design
of efficient machine learning (ML) techniques for real-time IDS is still in its infancy.
Although numerous solutions have been proposed, their usability in real-time platforms is
not yet fully established. Many of the available techniques produce high false-alarm rates
and high computational complexity. ML has since advanced to a next level called deep
learning (DL), which can be regarded as an advanced form of ML. DL can learn abstract
representations of complex hierarchical features from sequence data of TCP/IP packets.
Recently, DL has become more successful owing to the notable results obtained in natural
language processing, speech processing, and related areas. The primary reason is that DL is
associated with two major characteristics: hierarchical feature representations and the
learning of long-term dependencies of temporal patterns in large-scale sequential data. A
taxonomy of existing work on shallow as well as DL techniques is provided.
Intrusion detection is one of the mature fields of network security. Although numerous
approaches exist, such as rule-based systems and anomaly detection systems, our work
focuses on classification techniques. Classification has been widely used to detect
intrusions in wired systems. Only a limited number of attempts have been made at
classification-based intrusion detection in wireless ad hoc networks. An IDS for ad hoc
networks is presented in [2]. For wireless ad hoc networks, the authors proposed
cooperative and distributed anomaly-based ID, which provides a practical guide to building
an IDS. Based on routing updates, they target anomaly detection at the Media Access Control
(MAC) layer and the mobile application layer. To address the resource limitations that
every MANET faces, [3] extended the earlier work by proposing a cluster-based IDS. To
classify "abnormal" and "normal" behavior, a set of statistical features extracted from the
routing table is used, and decision tree induction is used for classification. If the
identified attack occurs within one hop, the proposed technique can identify the source of
the attack. Two types of distributed ID architecture, hierarchical and completely
distributed, have been proposed [4]. The ID technique used in this architecture targets the
network layer and is based on the Support Vector Machine (SVM). It uses a set of parameters
extracted from the network layer and suggests that the hierarchically distributed approach
may be a promising solution when compared with the completely distributed one. Liu et al.
[5] proposed a completely distributed anomaly detection technique that uses MAC layer
information to profile the behavior of mobile nodes and then applies cross-feature analysis
to the feature vectors built from the training data.
A cooperative and distributed IDS is presented in [6]; it uses the MAC, routing and
application layers together with a Bayesian classification model. The authors of [7] used
an ensemble of classifiers built by training several C4.5 classifiers and evaluated it on a
MANET against two types of attack. For intrusion detection in MANETs, the Conformal
Predictor k-nearest neighbor and Distance-based Outlier Detection (CPDOD) approach has been
used to detect sinkhole attacks [8]. Cost-sensitive classification was performed with a
k-nearest neighbor (KNN) classifier to estimate probabilities. Numerous reported studies
have used the KDD Cup 99 dataset for ID problems.
The wireless links used for communication in a MANET pose a major security challenge, so
an intrusion detection system (IDS) for MANETs is essential. This paper presents a
classifier based on the fitness-scaling chaotic genetic ant colony algorithm (FSCGACA) to
detect intrusions in MANETs. The proposed technique is validated on the benchmark KDD Cup
99 dataset. Compared with existing methods, the FSCGACA model gave better results on most
of the measures considered.
2. FSCGACA based classification model for IDS
The presented model operates in two stages. First, preprocessing is carried out: the data
are normalized and non-numerical features are transformed into numerical ones. Then, the
FSCGACA technique is executed to classify the data.
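The preprocessing stage can be illustrated with a short sketch. The paper does not state which normalization or encoding scheme is used, so min-max scaling and integer coding are assumptions here, and the function names and toy columns are hypothetical.

```python
def encode_symbolic(column):
    """Map each distinct symbolic value to an integer code (order of first appearance)."""
    codes = {}
    return [codes.setdefault(value, len(codes)) for value in column]


def min_max_normalize(column):
    """Rescale numeric values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical toy columns, not taken from the dataset.
protocol = ["tcp", "udp", "tcp", "icmp"]
duration = [0, 5, 10, 20]

protocol_codes = encode_symbolic(protocol)
duration_scaled = min_max_normalize(duration)
print(protocol_codes, duration_scaled)
```

Integer coding is the simplest choice for symbolic attributes; a one-hot encoding would avoid implying an order between categories, at the cost of more attributes.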
The process involved in GACA is shown in Fig. 1. Consider a classification problem with
$c$ classes in an $n$-dimensional pattern space and $p$ vectors
$X_i = [x_{i1}, x_{i2}, \ldots, x_{in}]$ ($i = 1, 2, \ldots, p$). The if-then rules take the
form given in Eq. (1).
$$\text{IF } (V_{1}^{\min} \le x_{1} \le V_{1}^{\max}) \wedge \cdots \wedge (V_{n}^{\min} \le x_{n} \le V_{n}^{\max}) \text{ THEN class} = j \tag{1}$$
where $n$ is the number of attributes and $V_{z}^{\min}$ and $V_{z}^{\max}$ are the lower and
upper bounds of the $z$th attribute $x_z$, respectively. Applied to the database, a sample
rule is given in Eq. (2).
Fig. 1. Process involved in GACA
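An if-then rule of the kind described for Eq. (1), in which every attribute has a lower and an upper bound, can be sketched as follows. The rule representation, the first-match policy, the default class and the example bounds are all illustrative assumptions, not details given in the paper.

```python
def rule_matches(sample, bounds):
    """True when every attribute lies inside its [low, high] interval."""
    return all(low <= x <= high for x, (low, high) in zip(sample, bounds))


def classify(sample, rules, default="normal"):
    """Return the class of the first matching rule, or the default class."""
    for bounds, label in rules:
        if rule_matches(sample, bounds):
            return label
    return default


# Hypothetical rule over two normalized attributes.
rules = [([(0.8, 1.0), (0.0, 0.2)], "anomaly")]

print(classify([0.9, 0.1], rules))
print(classify([0.5, 0.1], rules))
```

In this representation the optimizer only has to search over the interval bounds, which is what makes the rule form convenient for a population-based method.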
The GACA technique, a combination of the genetic algorithm (GA) and the ant colony
algorithm (ACA), is improved in two respects. First, fitness scaling converts the raw
fitness scores returned by the fitness function into values in a range suitable for the
selection function [9], which uses the scaled fitness values to choose the individuals of
the next generation. The selection function assigns a higher selection probability to
individuals with higher scaled values [10]. Four fitness-scaling techniques are in wide
use. Among them, power scaling finds a solution quickly because it improves diversity, but
it lacks stability. Rank scaling, in contrast, shows stability on several types of test
[11]. A new power-rank scaling technique is therefore obtained by combining the power and
rank methods as follows:
$$fit_x = \frac{r_x^{k}}{\sum_{j=1}^{N} r_j^{k}} \tag{3}$$
where $r_x$ is the rank of the $x$th individual ant, $N$ is the population size, and $k$ is
the power scale factor. The process consists of three sub-processes:
At first, all ants are sorted to obtain their respective ranks.
Then, each rank is raised to the exponent $k$.
Next, the scaled values are normalized by dividing each of them by the sum of the scaled
values over the total population.
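The three sub-processes above can be sketched as a single function. This is a minimal illustration under stated assumptions: a higher raw fitness is taken to be better and receives the larger rank, ties are broken by position, and the exponent value and function name are hypothetical.

```python
def power_rank_scaling(raw_fitness, k=2.0):
    """Scale raw fitness values by rank, power and normalization."""
    n = len(raw_fitness)
    # Sub-process 1: sort to obtain ranks (worst gets rank 1, best gets rank n).
    order = sorted(range(n), key=lambda i: raw_fitness[i])
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    # Sub-process 2: raise each rank to the exponent k.
    powered = [rank ** k for rank in ranks]
    # Sub-process 3: normalize by the sum over the whole population.
    total = sum(powered)
    return [value / total for value in powered]


scaled = power_rank_scaling([0.2, 0.9, 0.5], k=2.0)
print(scaled)
```

Because only ranks enter the computation, the scaled values do not depend on the magnitude of the raw fitness, while the exponent controls how strongly the best individuals are favored.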
Second, chaos theory is characterized by the so-called butterfly effect described by Lorenz
[12]. Working on weather systems, Lorenz discovered that small differences in the initial
conditions led subsequent simulations to drastically different final behavior, which means
that long-term prediction is in general impossible [13]. Sensitive dependence on initial
conditions is observed not only in complex systems but also in the simple logistic
equation. The well-known logistic equation is defined as
$$x_{n+1} = 4\,x_n\,(1 - x_n) \tag{4}$$
where $x_0 \in (0, 1)$ and $x_0 \notin \{0.25, 0.5, 0.75\}$. The chaotic sequence is used to
drive the mutation process. The complete FSCGACA procedure is given in Steps 1 to 15.
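The logistic map behind the chaotic sequence can be sketched as follows. The sketch assumes the fully chaotic form x_{n+1} = 4 x_n (1 - x_n) with a starting point in (0, 1) other than 0.25, 0.5 and 0.75; the function names and the linear mapping of a chaotic value into user bounds are illustrative assumptions rather than the authors' implementation.

```python
def logistic_sequence(x0, length):
    """Return `length` successive values of the logistic map starting from x0."""
    if not 0.0 < x0 < 1.0 or x0 in (0.25, 0.5, 0.75):
        raise ValueError("x0 must lie in (0, 1) and differ from 0.25, 0.5 and 0.75")
    values = []
    x = x0
    for _ in range(length):
        x = 4.0 * x * (1.0 - x)
        values.append(x)
    return values


def map_to_bounds(chaotic_value, low, high):
    """Linearly map a chaotic value in [0, 1] to the interval [low, high]."""
    return low + chaotic_value * (high - low)


sequence = logistic_sequence(0.3, 5)
mutated_gene = map_to_bounds(sequence[0], low=-1.0, high=1.0)
print(sequence, mutated_gene)
```

The excluded starting points are those that fall onto a fixed point of the map after a few iterations, which would make the sequence constant instead of chaotic.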
Step 1 (parameter settings): Choose the population size $N$, crossover probability $p_c$,
mutation probability $p_m$, elite selection probability $p_e$, trail level factor $\alpha$,
attractiveness factor $\beta$, pheromone evaporation coefficient $\rho$, pheromone constant
$Q$, power scale factor $k$, and initial logistic point $x_0$, and set the iteration epoch
z = 0.
Step 2 (initialization): Generate $N$ candidate solutions at random. Their fitness values
are computed and scaled with Eq. (3).
Step 3 (elitist selection): Select the best individuals, in the proportion set by the elite
selection probability, to replace an equal number of the worst individuals.
Step 4 (crossover): Choose individuals in the proportion set by the crossover probability
and apply one-point crossover: a locus is chosen at random, all data beyond that locus are
exchanged between the two parents, and the resulting individuals are the children.
Step 5 (mutation): Choose individuals in the proportion set by the mutation probability and
apply uniform mutation. The value of each chosen individual is replaced with a chaotic
number generated by Eq. (4), mapped to the user-specified upper and lower bounds.
Step 6 (evaluation): The fitness values of the new individuals are computed and scaled with
Eq. (3).
Step 7 (update): If a new individual has a better scaled fitness than the individual it is
compared with, the new individual and its fitness value replace the old ones.
Step 8:
Step 9: If the epoch limit has not been reached, jump to Step 3; otherwise go to the next
step.
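Step 10 (transform): Create a 10×10×⋅⋅⋅×10 graph according to the required precision of the
problem. The individuals are converted to ants. The route corresponding to the values of
each individual is assigned pheromone $\tau$ in the amount of its scaled fitness value. The
heuristic function value $\eta$ is defined to be the same as the scaled fitness value.
Step 11 (path selection): Each ant selects its new path from node $x$ to node $y$ with
probability
$$p_{xy} = \frac{\tau_{xy}^{\alpha}\,\eta_{xy}^{\beta}}{\sum_{u} \tau_{xu}^{\alpha}\,\eta_{xu}^{\beta}} \tag{5}$$
where $\alpha$ is the trail level factor, $\beta$ is the attractiveness factor, and the sum
runs over the nodes $u$ that the ant may move to next.
Step 12 (pheromone update): The trail is updated with the following formula
$$\tau_{xy} \leftarrow (1 - \rho)\,\tau_{xy} + \Delta\tau_{xy}, \qquad \Delta\tau_{xy} = \begin{cases} Q \cdot fit & \text{if the ant uses path } xy \\ 0 & \text{otherwise} \end{cases} \tag{6}$$
where $\rho$ is the pheromone evaporation coefficient, $Q$ is the pheromone constant, and
$fit$ is the scaled fitness value of the ant's route.
Step 13: z = z + 1.
Step 14: If the stopping criterion is satisfied, go to the next step.
Step 15: Select and output the optimal route.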
With the above steps, the presented approach classifies the input data to determine
whether the network traffic contains an intrusion.
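The path selection and pheromone update of Steps 11 and 12 can be sketched with the standard ant colony rules. This is an illustration only: the function names, the default values of the trail level factor `alpha`, the attractiveness factor `beta` and the evaporation coefficient `rho`, and the way the deposit is passed in are assumptions, not details recovered from the paper.

```python
import random


def selection_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """Probability of each candidate path from pheromone tau and heuristic eta."""
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(weights)
    return [w / total for w in weights]


def choose_path(tau, eta, alpha=1.0, beta=2.0, rng=random):
    """Draw one candidate path index according to the selection probabilities."""
    probs = selection_probabilities(tau, eta, alpha, beta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def update_pheromone(tau, used, deposit, rho=0.1):
    """Evaporate every trail by rho, then add the deposit on the path that was used."""
    return [(1.0 - rho) * t + (deposit if i == used else 0.0)
            for i, t in enumerate(tau)]


tau = [1.0, 1.0]          # hypothetical pheromone levels on two candidate paths
eta = [1.0, 3.0]          # hypothetical heuristic values
probs = selection_probabilities(tau, eta)
chosen = choose_path(tau, eta)
new_tau = update_pheromone(tau, used=1, deposit=0.5)
print(probs, chosen, new_tau)
```

Evaporation keeps old trails from dominating, while the deposit reinforces paths that belong to good solutions; the balance between the two is set by `rho` and the deposit size.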
3. Performance Validation
3.1. Dataset used
The performance of the presented IDS approach is demonstrated on the KDD Cup 1999 database
[14]. The database holds a set of 125973 samples under two class labels, normal and
anomaly. The attributes present in the database are shown in Table 1. The details of the
dataset are given in Table 2.
Table 1 Attributes Description
Table 2 Dataset Description
Dataset  Source        # of instances  # of attributes  # of classes  Normal/Anomaly
IDS      KDD Cup 1999  125973          41               2             67343/58630
Of the 125973 instances, 67343 fall under the normal class and the remaining 58630 fall
under the anomaly class. Furthermore, a total of 41 attributes are present in the
database. The attribute values are of various kinds, such as numeric, Boolean and symbolic
data.
3.2. Results and discussion
Table 3 gives the confusion matrices obtained by FSCGACA and by several existing
classifiers on the IDS database. The classifier performance can be calculated from the
values in the confusion matrix. Table 3 shows that FSCGACA correctly classifies 67000
samples as normal, the highest count among the classifiers, and 56570 samples as anomaly.
In contrast, the existing RBF correctly recognizes 62876 and 54187 instances as normal and
anomaly, respectively. LR correctly recognizes 65540 and 56789 instances as normal and
anomaly, respectively. From these values, FSCGACA gives the best classification, with the
highest total number of correctly predicted instances among the compared methods.
Fig. 2. FPR and FNR analysis of distinct classifier models on IDS dataset (bar chart of FPR
and FNR for FSCGACA, RBFNetwork, LR, Random Forest, Random Tree and Decision Tree; values
as in Table 4)
Table 3 Confusion Matrix of Intrusion Detection Dataset using Various Classifiers
Experts    FSCGACA          RBFNetwork       Logistic Regression  Random Forest    Random Tree      Decision Tree
           Normal  Anomaly  Normal  Anomaly  Normal  Anomaly      Normal  Anomaly  Normal  Anomaly  Normal  Anomaly
Normal     67000   343      62876   4467     65540   1803         63836   3507     64655   2688     64662   2701
Anomaly    2060    56570    4443    54187    1841    56789        5257    53373    2917    55713    2919    55711
Table 4 Performance Evaluation of Various Classifiers on Intrusion Detection System Dataset
Classifier     FPR   FNR   Sensitivity  Specificity  Accuracy  F-score  Kappa
FSCGACA        0.60  2.98  97.02        99.39        98.09     98.24    96.16
RBFNetwork     7.61  6.59  93.40        92.38        92.93     93.38    85.79
LR             3.07  2.73  97.26        96.92        97.10     97.29    94.19
Random Forest  6.16  7.61  92.39        93.83        93.04     93.58    85.99
Random Tree    4.60  4.31  95.68        95.39        95.55     95.84    91.06
Decision Tree  4.62  4.31  95.68        95.37        95.53     95.83    91.03
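The measures in Table 4 can be checked against the confusion matrices in Table 3 with a short sketch. The text does not state how the cells map to true and false positives, so the reading used here is an assumption: sensitivity and FNR are taken over the first column and specificity and FPR over the second column, which is the reading that reproduces the reported FSCGACA values. The function name is hypothetical.

```python
def measures_from_cells(a, b, c, d):
    """Compute the Table 4 measures (in percent) from one 2x2 block of Table 3.

    The block is read row by row: first row (a, b), second row (c, d).
    """
    n = a + b + c + d
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    accuracy = (a + d) / n
    f_score = 2 * a / (2 * a + b + c)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    values = {
        "FPR": 1 - specificity,
        "FNR": 1 - sensitivity,
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Accuracy": accuracy,
        "F-score": f_score,
        "Kappa": kappa,
    }
    return {name: 100 * value for name, value in values.items()}


# FSCGACA block of Table 3.
fscgaca = measures_from_cells(67000, 343, 2060, 56570)
for name, value in fscgaca.items():
    print(f"{name}: {value:.2f}")
```

Running the same function on the other blocks of Table 3 gives a quick consistency check between the two tables.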
To assess the classifier results, the presented FSCGACA was compared with several
classifiers, namely DT, RBF, RF, LR, and RT, on the IDS database. Table 4 gives the
classification performance in terms of several validation measures. Among the existing
classifiers, RBF performed worst overall. Fig. 2 shows the results of the classifiers in
terms of FPR and FNR. RBF has the highest FPR (7.61) and RF has the highest FNR (7.61), so
neither performs well. The RT and DT classifiers performed almost identically. They did
better than RBF and RF but worse than LR and FSCGACA. LR is efficient, with low FPR and FNR
values of 3.07 and 2.73. FSCGACA achieved the lowest FPR, 0.60, while its FNR of 2.98 is
slightly higher than that of LR.
Fig. 3 shows the comparative analysis of the techniques under the measures sensitivity,
specificity, accuracy, F-score and kappa value. In terms of sensitivity and specificity,
RBF showed poor results, with values of 93.40 and 92.38 (the lowest specificity), and RF
showed the lowest sensitivity, 92.39. DT and RT show competitive performance, with a
sensitivity of 95.68 and specificities of 95.37 and 95.39, respectively. LR achieves the
highest sensitivity, 97.26, with a specificity of 96.92. The presented FSCGACA classifier
achieves the highest specificity, 99.39, with a sensitivity of 97.02, close to that of LR.
In terms of accuracy, FSCGACA shows the highest value, 98.09, while RBF, LR, RF, RT, and DT
achieve 92.93, 97.10, 93.04, 95.55 and 95.53, respectively. LR gives the best F-score among
the existing techniques, 97.29, but FSCGACA attains the highest F-score, 98.24. FSCGACA
also gives the highest kappa value, 96.16.
Fig. 3. Classification performance analysis on IDS dataset
4. Conclusion
The inherently distributed architecture and changing network topology of MANETs make
classical centralized monitoring ineffective. Because the nodes lack physical security,
attackers can easily compromise them. The wireless links used for communication in a MANET
pose a major security challenge, so an intrusion detection system (IDS) for MANETs is
essential. This paper presented a classifier based on the fitness-scaling chaotic genetic
ant colony algorithm (FSCGACA) to detect intrusions in MANETs. The proposed technique was
validated on the benchmark KDD Cup 99 dataset. Compared with existing methods, the FSCGACA
model gave better results on most of the measures considered.
References
[1] Y. Kim, “Remote sensing and control of an irrigation system using a distributed
wireless sensor network,” IEEE Trans. Instrum. Meas., vol. 57, no. 7, pp. 1379–1387,
Jul. 2008
[2] Y. Zhang, W. Lee, Y. Huang, ID techniques for mobile wireless networks, Wireless
Networks 9 (5) (2003) 545–556.
[3] Y. Huang, W. Lee, A cooperative IDS for ad hoc networks, in: Proceedings of the 1st
ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN'03), Fairfax,
VA, USA, 2003, pp. 135–147.
[4] H. Deng, Q. Zeng, D.P. Agrawal, SVM-based IDS for wireless ad hoc networks, in:
Proceedings of the 58th IEEE Vehicular Technology Conference (VTC'03), vol. 3,
Orlando, FL, USA, 6–9 October 2003, pp. 2147–2151.
[5] Y. Liu, Y. Li, H. Man, MAC layer anomaly detection in ad hoc networks, in:
Proceedings of the 6th Annual IEEE SMC Information Assurance Workshop (IAW'05),
West Point, NY, USA, 15–17 June 2005, pp. 402–409.
[6] S. Bose, S. Bharathimurugan, A. Kannan, Multi-layer integrated anomaly ID for
mobile ad hoc networks, in: Proceedings of the IEEE International Conference on
Signal Processing, Communications and Networking (ICSCN 2007), February 2007,
pp. 360–365.
[7] J.B.D. Cabrera, C. Gutiérrez, R.K. Mehra, Ensemble methods for anomaly detection
and distributed ID in mobile adhoc networks, In Information Fusion 9 (2008) 96–119.
[8] W. Shim, G. Kim, S. Kim, A distributed sinkhole detection method using cluster
analysis, Expert Systems with Applications 37 (12) (2010) 8486–8491.
[9] J. Yang, J. Deng, S. Li, and Y. Hao, “Improved traffic detection with support vector
machine based on restricted boltzmann machine,” Soft Computing, vol. 21, no. 11,
pp. 3101–3112, 2017.
[10] Z. Wang, “The applications of DL on traffic identification,” BlackHat USA, 2015.
[11] R. C. Staudemeyer and C. W. Omlin, “Extracting salient features for network ID
using ML methods,” South African computer journal, vol. 52, no. 1, pp. 82–96, 2014
[12] H. Sak, A. Senior, and F. Beaufays, “Long short-term memory recurrent neural
network architectures for large scale acoustic modeling,” in Fifteenth Annual
Conference of the International Speech Communication Association, 2014.
[13] Mazraeh, S., Ghanavati, M., Neysi, S.H.N., 2016. IDS with decision tree and combine
method algorithm. Int. Acad. J. Sci. Eng. 3, 21–31.
[14] KDD Cup (1999). Available at: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html