
Energy Efficient Clustering and Data Aggregation Protocol using Machine Learning in Wireless Sensor Networks

Sangeeta Kumari1*, Ph.D. Student, Maharishi University of Information Technology, Lucknow, UP

Dr. Rajat Updhyay2, Ph.D. Guide, Maharishi University of Information Technology, Lucknow, UP

Dr Vivek Deshapande3, Ph.D., Maharishi University of Information Technology, Lucknow, UP

Abstract: Because Wireless Sensor Networks (WSNs) are resource-constrained, energy-efficient data transmission is required across their various applications. This research presents a novel routing protocol, called Strong Clustering Algorithm & Data Aggregation using Machine Learning (SCADA-ML), designed to achieve scalability, energy efficiency, and QoS optimization with minimum overhead. The SCADA-ML design is based mainly on machine learning techniques for CH selection and robust data aggregation, which minimize energy consumption while maintaining the other performance measures for WSNs of different sizes. In the first contribution, we focus on optimal CH selection and cluster formation using the supervised ML technique called Artificial Neural Network (ANN). The problem of optimal CH selection for each cluster is formulated according to the architecture of the ANN (input layer, hidden layer, and output layer), in which the properties of every sensor node, such as residual energy, distance from the BS, and allocated bandwidth, are processed as input to the ANN. Because the CH node may receive redundant information, in the second contribution the CH node of each cluster performs efficient data aggregation using the Independent Component Analysis (ICA) ML technique to minimize energy consumption. Only clusters with similar data need to perform data aggregation. Compared to other data aggregation methods, ICA is computationally efficient and reduces redundant data based on differential entropy. The experimental results show that SCADA-ML outperforms the existing ML-based clustering and data aggregation algorithms.

Keywords: Artificial neural network, clustering, cluster head selection, data aggregation, data transmission, machine learning, independent component analysis.

Journal of Xi'an University of Architecture & Technology

Volume XII, Issue VI, 2020

ISSN No : 1006-7930

Page No: 628


I. Introduction

A Wireless Sensor Network (WSN) consists of sensor nodes. Because a sensor node is small and has limited power, energy consumption is the major concern and the deciding factor for the overall network lifetime. Data collection, processing, sending, receiving, and forwarding consume a large share of a sensor node's energy. In a WSN, energy efficiency is determined by several factors, such as the network architecture, topology design, routing protocol, MAC (medium access control) protocol, and data aggregation scheme [1]. The key terms for WSNs, energy efficiency and network lifetime, are defined as follows:

- Energy Efficiency: The operation of a WSN should be extended for as long as possible. Under a normal topology control protocol, every sensor consumes a similar amount of energy in each network round or second. A topology control protocol is energy efficient if it is able to extend the overall network lifetime of the WSN. Assuming that all sensor nodes are equally important, the energy consumption of every sensor node should be minimized.

- Network lifetime: The network lifetime is the number of data collection rounds, or the overall time in minutes, until the first sensor node in the WSN dies. For example, some WSN applications require all sensor nodes to operate together; in such cases the lifetime is the total number of network rounds until the first sensor node dies.
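The round-based lifetime definition above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each node spends a fixed amount of energy per round; the function name `network_lifetime` and the fixed per-round cost are hypothetical and are not part of the paper's model.

```python
def network_lifetime(initial_energy, cost_per_round):
    """Count the rounds completed before the first node runs out of energy.

    Assumption (not from the paper): every node pays a fixed, positive
    energy cost in each data collection round.
    """
    if len(initial_energy) != len(cost_per_round):
        raise ValueError("one per-round cost is needed for every node")
    if any(c <= 0 for c in cost_per_round):
        raise ValueError("per-round costs must be positive")

    energy = list(initial_energy)
    rounds = 0
    # Run another round only while every node can still afford it.
    while all(e >= c for e, c in zip(energy, cost_per_round)):
        energy = [e - c for e, c in zip(energy, cost_per_round)]
        rounds += 1
    return rounds
```

With three nodes holding 10, 6, and 9 energy units and spending 2, 3, and 1 units per round, the second node is exhausted first, so the lifetime under this definition is two rounds.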

At present, energy consumption is the most important research problem for WSNs. Energy efficiency is necessary for two reasons: 1) a single small radio sensor device has a low-power battery that is expected to operate for several months after deployment [2]; 2) if a WSN is designed and deployed over an inaccessible region, all wireless sensor devices in the network must use their batteries efficiently so that the network lifetime is extended.


Network layer protocols such as routing methods are the main component of WSNs, where the major operations are performed; energy consumption is therefore a concern for those operations. Several types of routing protocols have been designed to achieve energy efficiency in WSNs [3] [4]. Clustering-based routing protocols have shown better energy efficiency than other techniques. Cluster formation and CH selection is an NP-hard research problem, and various methodologies for optimal CH selection and cluster formation have been presented to address it. With clustering methods, data aggregation is another research problem: redundant data is collected at the CH node, so an appropriate method is needed to fuse the common data by discarding duplicate information, reducing both the number of transmissions and the energy consumption. Data aggregation is performed by the CH node.

Developing efficient algorithms that are suitable for many different application scenarios is a challenging task. In particular, WSN designers need to address common issues related to data aggregation, data reliability, localization, node clustering, energy-aware routing, event scheduling, fault detection, and security. Machine learning (ML) was introduced in the late 1950s as a technique for artificial intelligence [5]. Over time, its focus evolved and shifted towards algorithms that are computationally viable and robust. In the last decade, ML techniques have been used extensively for a wide range of tasks, including classification, regression, and density estimation, in a variety of application areas such as bioinformatics, speech recognition, spam detection, computer vision, fraud detection, and advertising networks. ML techniques have been central to developing WSN applications since the beginning of WSNs, as many of the problems in WSNs can be posed as optimization or modeling problems. ML in WSNs helps to discover important new correlations, patterns, and trends, often previously unknown, by sifting through large amounts of data using pattern recognition, statistical, and mathematical techniques. It is useful not only for knowledge discovery, that is, the identification of new phenomena, but also for improving our understanding of known phenomena. In other words, ML techniques can help build decision-aid tools and facilitate the exploration of sensor data obtained from WSNs.

The key motivation of this paper is that the two concerns of WSNs, clustering (i.e. optimal CH selection) and data aggregation, can be solved effectively by employing ML techniques. The literature review shows that researchers assume different applications, preferences, and assumptions when using ML methods. Such differences make it challenging for other researchers to build upon existing work. Hence, a generalized framework for machine learning in WSNs is necessary. The main objective of this research is to design generalized ML-based WSN algorithms (clustering and data aggregation) that achieve a trade-off between QoS performance and energy efficiency regardless of the end-user application. Section II briefly reviews ML-based clustering and data aggregation methods. Section III presents the design of the SCADA-ML protocol. Section IV presents the experimental results. Section V presents the conclusion and future work.

II. Related Works

This section reviews recent work on ML-based clustering and data aggregation for WSNs.

A. ML-based Clustering

In [6]-[19], various algorithms were designed for optimal clustering and CH selection using ML techniques. In [6], the authors proposed Hy-IoT, an efficient hybrid energy-aware clustering communication protocol for green IoT network computing, and also provided a real IoT network architecture for testing the proposed protocol against commonly used protocols. Effective cluster head selection improves the use of the nodes' energy and consequently increases the network lifetime and the packet transmission rate to the base station. Hy-IoT uses different weighted election probabilities for selecting a cluster head according to the heterogeneity level of the region. In [7], the authors note that Internet of Things (IoT) devices are widely used in fields such as environmental monitoring, industry, and smart homes. In such cases, a cluster head is selected among the different IoT devices of a WSN-based IoT network to maintain the network with efficient data transmission. To achieve efficient cluster head selection, they used the Fuzzy C-Means (FCM) clustering algorithm. In [8], the authors addressed IoT-related issues, mostly energy efficiency. The proliferation of these devices in a communicating and actuating network creates the Internet of Things (IoT), in which sensors and actuators blend seamlessly with the environment around us and information is shared across platforms to develop a common operating picture.

In [9]-[15], protocols were designed for IoT applications. In [9], the authors proposed different routing solutions for subscription-based methods, including content-based and context-based routing for IoT-enabled WSNs. They designed the Energy-Efficient Content-Based Routing (EECBR) protocol for the IoT, which minimizes energy consumption in WSNs. In [10] [11], the authors discussed the need for green IoT and the various software-based and hardware-based technologies required to enable its realization. Energy-efficient inter-node communication and improved routing mechanisms were identified as the issues that must be addressed to facilitate large-scale adoption of green IoT. In [12], a different approach was taken: a priced public sensing framework was proposed for delivering public data gathered from cloud and heterogeneous resources. The work is data-centric and focuses on the supply and demand chain of public data from mobile phones. In [13], the authors proposed data collection in cellular devices using device-to-device communication in an IoT and Smart City context. This results in more efficient resource usage and minimizes energy consumption. One device aggregates data from several surrounding devices and then sends the data to the cellular station, instead of every device sending its data separately. In [14], an overview of using conventional WSN protocols to achieve device-to-device communication in IoT was presented. In [15], a recent work proposed data aggregation in WSNs using the fuzzy c-means clustering approach. The authors designed similarity-aware data aggregation in which fuzzy c-means clustering organizes sensors into clusters based on data similarity, and a support degree function is used for outlier detection.


In [16], the authors proposed a novel energy-efficient routing protocol based on reinforcement learning. In the first step of the protocol, a new clustering method is applied to the network and the network is established as a connected graph. Data is then transmitted using the Q-value parameter of the reinforcement learning technique. In [17], the authors proposed a novel ML-based approach for sensor node validation: spectral clustering is used to detect a bad sensor and delete it from the domain, with sensors indexed by their location using a simple spectral clustering model. In [18], the authors introduced MLProph, a routing protocol for OppNets based on ML algorithms. It uses neural networks and decision trees to estimate the probability of successful delivery. The ML model is trained on factors such as the predictability value inherited from the PROPHET routing scheme, node popularity, node energy consumption, node location, and mobility speed. In [19], the authors proposed a robust distributed clustering method that works without a fusion center. The algorithm combines distributed eigenvector computation with distributed K-means clustering, and a distributed power iteration method computes the eigenvector of the graph Laplacian.

B. ML-based Data Aggregation

In [20], the Center at Nearest Source (CNS) algorithm was proposed. In this algorithm, every node that detects an event sends its data to a specific node, called the aggregator, over a shortest path. The aggregator is the node closest to the sink (in hops) that detects the event.

In [21], the Directed Diffusion algorithm was one of the earliest solutions to propose attribute-based routing. In such schemes, data can be opportunistically aggregated when they meet at any intermediate node.


In [22], the Greedy Incremental Tree (GIT) approach was proposed on the basis of Directed Diffusion. The GIT algorithm establishes an energy-efficient path and greedily attaches other sources onto the established path.

In [23] and [24], data aggregation was implemented on a real-world testbed and the Tiny Aggregation Service (TAG) framework was introduced. TAG uses a shortest path tree and proposes improvements such as snooping-based and hypothesis-testing-based optimizations, dynamic parent switching, and the use of a child cache to estimate data loss.

In [25], the authors proposed a data aggregation method for wireless sensor networks using artificial neural networks. A data fusion tree is constructed to reduce the packet flow, and its leaf nodes can be updated dynamically.

In [26], the authors presented a joint design of data aggregation and routing, with a grid-based routing and aggregator selection scheme that achieves low energy dissipation and low latency without sacrificing quality.

In [27], the authors studied data fusion under a communication constraint between the fusion center and each sensor, and presented a data fusion mechanism for target tracking in wireless sensor networks based on quantized innovations and Kalman filtering. By adding some delay time, all the data gathered by a relay node can be fused at once in order to reduce energy consumption.

In [28], with the aim of guaranteeing data quality, the authors proposed several QoS (quality of service) metrics for the data aggregation process, including lifetime, data delay, and retransmission rate, and discussed in detail how to guarantee these metrics. Like the tree-based approaches, cluster-based schemes also use a hierarchical organization of the network. In these approaches, however, nodes are divided into clusters.

III. SCADA-ML Protocol

This section presents the design of the proposed SCADA-ML protocol for WSNs, covering ML-based clustering and ML-based data aggregation. SCADA-ML is proposed as a generalized solution that uses ML techniques to extend the network lifetime and improve QoS performance with acceptable or minimum overhead. The SCADA-ML routing protocol consists of two contributions:

- ML-based Optimal CH Selection: In the first contribution, we focus on optimal CH selection and cluster formation using the supervised ML technique called Artificial Neural Network (ANN). The problem of optimal CH selection for each cluster is formulated according to the architecture of the ANN (input layer, hidden layer, and output layer), in which the properties of each sensor node, such as residual energy, distance from the BS, and allocated bandwidth, are processed as input to the ANN. The hidden layer selects the CH, which is reported at the output layer, using the adaptive learning of the ANN. After clustering, the inter-cluster and intra-cluster data transmissions are performed.

- ML-based Efficient Data Aggregation: Because the CH node may receive redundant information, in the second contribution the CH node of each cluster performs efficient data aggregation using the Independent Component Analysis (ICA) ML technique to minimize energy consumption. Only clusters with similar data need to perform data aggregation. Compared to other data aggregation methods, ICA is computationally efficient and reduces redundant data based on differential entropy.


A. ML-based Clustering and Data Transmission

In traditional clustering methods for sensor networks, cluster head selection relies mainly on two parameters: residual energy and distance from the base station node. In machine learning based clustering, the cluster head is elected on the basis of a cost metric function. With traditional clustering, cluster heads are overcrowded in dense areas, whereas sparse areas may have very few cluster heads or none at all. Such clustering reduces the network lifetime, which can be avoided by using an Artificial Neural Network (ANN) based approach for cluster head selection. To elect cluster heads, the ANN uses a layered architecture with three types of layers: input layer, hidden layer, and output layer. Figure 1 shows the proposed ANN-based CH election architecture.

Figure 1. Proposed ANN based CH selection approach


Artificial Neural Networks (ANNs) find applications in many domains, including fraud detection, speech recognition, health care, handwriting recognition, face recognition, and pattern recognition. ANNs have good potential to solve numerous problems because of the learning ability of their hidden layers. This paper provides an optimal solution for the periodic election of cluster heads through an ANN. ANN-based cluster head selection is fast, simple to implement, flexible, and adaptive, which makes this architecture well suited to ad hoc networks.

As shown in Figure 1, a two-layer feed-forward neural network is designed to select the optimal CH node from the set of sensor nodes. The input layer consists of $n$ sensor nodes $S = \{S_1, S_2, \dots, S_n\}$, which compete with each other in the hidden layer to become the CH. In the hidden layer, each node is evaluated through its cost function $f(S_i, C_i)$, $1 \le i \le n$, which is computed from two parameters, residual energy and distance from the BS node, as:

$$f(S_i, C_i) = CE_{S_i} + (200 - dist(S_i, BS)) \qquad (1)$$

where $CE_{S_i}$ denotes the currently available energy of node $S_i$ and $dist(S_i, BS)$ is the geographical distance from node $S_i$ to the BS.
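Eq. (1) can be evaluated directly, as in the following minimal sketch. The function name `cost`, the node coordinates, and the energy values are hypothetical; the sketch assumes a Euclidean distance and adds energy and distance without any normalisation, exactly as Eq. (1) is written. The BS position (150, 160) is the one listed in Table 1.

```python
import math

def cost(residual_energy, position, bs_position, offset=200.0):
    """Eq. (1): residual energy plus (offset - distance to the BS).

    Assumption: dist() is the Euclidean distance between 2-D coordinates.
    """
    return residual_energy + (offset - math.dist(position, bs_position))

BS = (150.0, 160.0)  # BS position from Table 1

# Hypothetical nodes: name -> (residual energy, (x, y) position)
nodes = {
    "S1": (4.0, (150.0, 100.0)),  # 60 units from the BS
    "S2": (5.0, (30.0, 0.0)),     # 200 units from the BS
    "S3": (2.0, (150.0, 160.0)),  # located at the BS
}

costs = {name: cost(energy, pos, BS) for name, (energy, pos) in nodes.items()}
best = max(costs, key=costs.get)  # node with the highest cost value
```

In this example the node closest to the BS obtains the highest value even though it has the least residual energy, which shows how strongly the distance term can dominate when the two quantities are on different scales.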

Algorithm 1 describes the CH election process. An ANN-based competitive learning technique is used in this paper to evaluate the cost functions of all the sensor nodes. Each node of the input layer shown in Figure 1 represents a sensor node of the cluster, and its cost function value is computed before the hidden layer is applied.

In the second layer, the hidden layer shown in Figure 1, the neurons contend among themselves and the neuron with the highest cost function value $f(S_i, C_i)$ is determined. Each neuron of the hidden layer is based on an adaptive learning technique in which the learning rate $\varepsilon$ decides how far the weight vector adapts towards the input pattern, and it directly affects convergence. A learning rate of zero ($\varepsilon = 0$) means that there is no learning. A learning rate of one ($\varepsilon = 1$) means that learning is very fast. Generally, the learning rate is kept at a fixed value over time. For periodic selection of the optimal cluster head (CH), the process is repeated at regular intervals in the network.

Algorithm 1: ANN-based CH election

Input
$s = \{s_1, s_2, \dots, s_n\}$,
Learning rate $\varepsilon$, where $0 \le \varepsilon \le 1$

Output
$CH$: elected CH from set $w$

Input Layer
1. Initialize $w = \{w_1, w_2, \dots, w_n\}$ at the input layer for the CH competition
2. Initialize the cost function for each sensor of the input layer, $f(w_i, C_i) = 0$

Hidden Layer
3. For each layer
4. Compute the cost function $f(w_i, C_i)$ for each sensor using Eq. (1)
5. Assign $f(w_i, C_i)$ to neuron $n_i$ as its weight function
6. Neurons compete with each other with learning rate $\varepsilon$
7. Update the weight value of $n_i$ after each learning step: $f(w_i, C_i) = f(w_i, C_i) + \varepsilon (w_i - f(w_i, C_i))$
8. Estimate the $n_i$ with the highest cost function value
9. Repeat steps 4-8
10. End For

Output Layer
11. Label the estimated neuron $n_i$ with the actual sensor node $w_i$
12. $CH = w_i$
13. Return $CH$
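One possible reading of Algorithm 1 is sketched below. It is an interpretation, not the authors' implementation: it assumes that each neuron weight starts at zero, that the term $w_i$ in the update of step 7 stands for the Eq. (1) cost of node $i$, and that the competition runs for a fixed number of iterations. The names `elect_cluster_head`, `costs`, and `iterations` are hypothetical.

```python
def elect_cluster_head(costs, epsilon=0.5, iterations=10):
    """Competitive election: the neuron with the highest weight wins.

    costs: {node: Eq. (1) cost value}, assumed to be computed beforehand.
    Assumption: the update target in step 7 is the node's own cost value.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("learning rate must lie in [0, 1]")

    # Steps 1-2: one neuron per sensor node, weights initialised to zero.
    weights = {node: 0.0 for node in costs}

    # Steps 3-10: move each weight towards its target with rate epsilon.
    for _ in range(iterations):
        for node, target in costs.items():
            weights[node] += epsilon * (target - weights[node])

    # Steps 11-13: map the winning neuron back to its sensor node.
    winner = max(weights, key=weights.get)
    return winner, weights
```

Under these assumptions, a learning rate of 1 copies each cost into its weight in a single pass, while a learning rate of 0 leaves every weight at zero, so no node can be distinguished from the others. This matches the "very fast learning" and "no learning" cases described in the text.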

After cluster formation, this section presents the route formation and data transmission functionality in the clustered WSN, according to the objective function defined in equation (2). The link cost between two nodes $w_i$ and $w_j$ is defined as the amount of energy consumed to send and receive a packet successfully. The link cost model is represented as:


$$LC = \frac{E_i^{D}}{E_t(S_i, S_j) + E_r(S_i, S_j)} \qquad (2)$$

where $E_i^{D}$ is the energy associated with the delivery ratio of a packet originating from source node $S_i$ and correctly received at the destination node, $E_t(S_i, S_j)$ is the energy used in transmitting from $S_i$ to $S_j$, and $E_r(S_i, S_j)$ is the energy used in receiving the packet. Data is routed from every CH to the sink over multi-hop paths, which are obtained by minimizing equation (2).

Consider the BS node labeled as 0 and the current CH nodes labeled as $CH_i$, where $i = 1, 2, \dots, k$. The problem is formulated with the objective:

$$\text{Minimize} \sum_{1 \le i \le k} LC$$

subject to the following constraints:

$$\sum_{1 \le j \le k} P_{ij} - \sum_{1 \le j \le k} P_{ji} = p_i \qquad (3)$$

$$P_{ij} \ge 0, \quad 1 \le j \le k \qquad (4)$$

$$p \le E_{thr} \qquad (5)$$

Constraint (3) represents the amount of data $p_i$ transmitted, and constraint (4) ensures that the number of packets transmitted between two nodes is non-negative. Constraint (5) requires the energy consumption $p$ of any sensor node to stay below the predefined energy threshold value $E_{thr}$.
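The multi-hop route selection can be illustrated with a shortest-path search over link costs. The sketch below is not the paper's solver: it assumes that the link cost of every CH-to-CH and CH-to-BS link has already been computed from the link cost model, that all costs are non-negative, and it uses Dijkstra's algorithm as a stand-in for the minimization. The flow and energy constraints (3)-(5) are not modelled. The function name `cheapest_route` and the example link values are hypothetical.

```python
import heapq

def cheapest_route(links, source, sink=0):
    """Return (total_cost, path) of the cheapest route from source to sink.

    links: {(u, v): link_cost} for directed links between integer node ids;
    the BS (sink) is labeled 0, as in the problem formulation.
    Returns None when the sink cannot be reached.
    """
    graph = {}
    for (u, v), link_cost in links.items():
        graph.setdefault(u, []).append((v, link_cost))

    queue = [(0.0, source, [source])]  # (cost so far, node, path so far)
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == sink:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (total + link_cost, nxt, path + [nxt]))
    return None

# Hypothetical link costs between cluster heads 1-3 and the BS (node 0).
links = {(1, 0): 5.0, (1, 2): 1.0, (2, 0): 2.0, (3, 2): 1.5}
route = cheapest_route(links, source=1)
```

Here cluster head 1 reaches the BS more cheaply through cluster head 2 (total cost 3.0) than over its direct link (cost 5.0), which is the kind of multi-hop path the formulation favours.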

B. ML-based Data Aggregation

This is the second contribution of the SCADA-ML routing protocol. Data aggregation reduces the number of transmissions, which further improves energy efficiency and QoS performance.


Figure 2. ICA-based data aggregation in SCADA-ML protocol

For the data aggregation phase, we propose an ML-based approach. After node deployment, clustering is performed using the ANN-based approach. In the data aggregation phase, the data similarity among neighbouring sensor nodes is then computed in the form of data correlation. A predefined threshold value decides whether two data items belong to a cluster with similar data: if the correlation value exceeds the threshold, the two data items belong to a cluster with the same data. Data aggregation is applied only to the similar-data clusters to prevent loss of information. Clusters with different data transmit their data without applying the data aggregation technique.
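The similarity test can be sketched as follows. The paper does not state which correlation measure or threshold is used, so this sketch assumes the Pearson correlation coefficient and a hypothetical threshold of 0.9; the names `pearson` and `should_aggregate` and the sample readings are illustrative only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long reading series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (std_x * std_y)

def should_aggregate(x, y, threshold=0.9):
    """True when the two series are similar enough to be aggregated."""
    return pearson(x, y) > threshold

# Hypothetical temperature readings from three neighbouring nodes.
node_a = [20.0, 20.5, 21.0, 21.5]
node_b = [20.1, 20.6, 21.1, 21.6]  # same trend as node_a
node_c = [21.5, 20.0, 21.0, 20.5]  # different trend
```

Readings from `node_a` and `node_b` follow the same trend and would be aggregated, while `node_a` and `node_c` have a correlation of -0.4 and would be transmitted without aggregation.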

The Independent Component Analysis (ICA) algorithm is proposed for cluster-based data aggregation in this work and is applied to the similar-data clusters by the CH node. ICA performs efficiently compared to other techniques such as PCA and compression methods. It minimizes mutual information using the concept of differential entropy. Aggregated data from the similar clusters are forwarded to the sink node. Computation and energy consumption are therefore reduced because fewer aggregation processes are needed. Figure 2 shows the working of ICA-based data aggregation in the SCADA-ML protocol.
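For reference, the link between mutual information and differential entropy that ICA relies on can be written out using the standard textbook definitions. The symbols $\mathbf{x}$ (the readings collected at the CH), $W$ (the unmixing matrix), and $\mathbf{y}$ (the estimated components) are introduced here for illustration and do not appear in the paper.

```latex
% Differential entropy of a continuous random variable Y with density p_Y
h(Y) = -\int p_Y(y) \, \log p_Y(y) \, dy

% Mutual information between the components of y = (y_1, ..., y_n)
I(y_1, \dots, y_n) = \sum_{i=1}^{n} h(y_i) - h(\mathbf{y})

% For an invertible linear transform y = W x
h(\mathbf{y}) = h(\mathbf{x}) + \log \lvert \det W \rvert
\quad\Longrightarrow\quad
I(y_1, \dots, y_n) = \sum_{i=1}^{n} h(y_i) - h(\mathbf{x}) - \log \lvert \det W \rvert
```

Since $h(\mathbf{x})$ does not depend on $W$, minimizing the mutual information amounts to choosing $W$ so that the sum of the marginal entropies, corrected by the $\log \lvert \det W \rvert$ term, is as small as possible. The mutual information is zero exactly when the components are statistically independent, that is, when they share no redundant information.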

IV. Experimental Results

This section presents the simulation results of the SCADA-ML protocol in two subsections. Subsection A presents a comparative analysis against existing ML-based clustering protocols. Subsection B presents a comparative analysis against existing ML-based data aggregation protocols. The simulation parameters used for both studies are listed in Table 1. Performance is measured in terms of average throughput, average delay, Packet Delivery Ratio (PDR), average energy consumption, and network lifetime.
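The paper does not spell out how these metrics are computed. The sketch below uses the definitions most common in WSN simulation studies and should be read as an assumption. The `PacketRecord` type and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PacketRecord:
    size_bytes: int
    sent_at: float                 # send time in seconds
    received_at: Optional[float]   # receive time in seconds, None if lost


def packet_delivery_ratio(records: List[PacketRecord]) -> float:
    """Delivered packets as a percentage of sent packets."""
    if not records:
        return 0.0
    delivered = sum(1 for r in records if r.received_at is not None)
    return 100.0 * delivered / len(records)


def average_throughput_kbps(records: List[PacketRecord], duration_s: float) -> float:
    """Delivered bits per second over the run, expressed in kbps."""
    bits = sum(r.size_bytes * 8 for r in records if r.received_at is not None)
    return bits / duration_s / 1000.0


def average_delay_s(records: List[PacketRecord]) -> float:
    """Mean end-to-end delay of the delivered packets, in seconds."""
    delays = [r.received_at - r.sent_at for r in records if r.received_at is not None]
    return sum(delays) / len(delays) if delays else 0.0
```

Lost packets count against the PDR but are excluded from the throughput and delay averages, which is the usual convention.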

Table 1. Density variation parameters

Parameter                         Value
Sensor nodes                      40-400
BS position                       (150, 160)
Simulation time                   150 seconds
Network size                      150 x 150
MAC                               802.11
Packet size                       512 bytes
Sensor node transmission range    100 m
Channel bit rate                  10 kbps
Initial energy                    5000 nJ
Mobility model                    Random Way-point model
Sensor speed                      0 m/s
Packet sending rate               20 packets/sec
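For readers who want to reproduce the setup, the parameters of Table 1 can be captured in a small configuration object. This is a sketch only: the class and field names are hypothetical, the node counts follow the 40 to 400 sweep in steps of 40 shown on the figure axes, and the unit of the network size is not stated in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationConfig:
    node_counts: tuple = tuple(range(40, 401, 40))  # 40, 80, ..., 400
    bs_position: tuple = (150, 160)
    simulation_time_s: int = 150
    area: tuple = (150, 150)          # unit not stated in Table 1
    mac: str = "802.11"
    packet_size_bytes: int = 512
    tx_range_m: int = 100
    channel_bit_rate_bps: int = 10_000
    initial_energy_nj: int = 5000
    mobility_model: str = "Random Way-point"
    node_speed_mps: float = 0.0
    packet_rate_pps: int = 20

    def packet_airtime_s(self) -> float:
        """Time to put one packet on the channel at the configured bit rate."""
        return self.packet_size_bytes * 8 / self.channel_bit_rate_bps

    def node_density(self, node_count: int) -> float:
        """Nodes per unit area for a given deployment size."""
        return node_count / (self.area[0] * self.area[1])
```

Keeping the derived quantities as methods ensures they stay consistent if a table value is changed for a different scenario.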

A. ML-based Clustering Protocols Evaluation

The proposed clustering algorithm SCADA-ML is compared with existing ML-based clustering protocols: CH selection using decision trees (CHDT) [29], clustering using Q-learning (QLIQUE) [30], and the recent clustering using K-means (CHKM) [31].

Figure 2. Average throughput analysis vs. density of ML-based clustering methods (x-axis: number of sensor nodes, 40 to 400; y-axis: average throughput in kbps; series: CHDT, QLIQUE, CHKM, SCADA-ML)

This section investigates the performance of the proposed SCADA-ML (with the ML-based data aggregation technique) against the various ML-based clustering methods. Figures 2 and 3 show the average throughput and PDR for a varying number of sensor nodes. The results show that throughput and PDR increase with node density for all the ML-based clustering techniques. SCADA-ML delivers higher throughput and PDR than the other techniques because of the proposed clustering and data transmission algorithms: along with optimal CH selection, SCADA-ML also optimizes data transmission by considering the energy and distance constraints. Among the other ML-based clustering methods, the recent K-means-based clustering (CHKM) performs better than the Q-learning and decision tree algorithms.

Figure 3. PDR analysis vs. density of ML-based clustering methods (x-axis: number of sensor nodes, 40 to 400; y-axis: PDR in %; series: CHDT, QLIQUE, CHKM, SCADA-ML)

Figure 4. Average end-to-end delay analysis vs. density of ML-based clustering methods (x-axis: number of sensor nodes, 40 to 400; y-axis: delay in seconds; series: CHDT, QLIQUE, CHKM, SCADA-ML)

Figure 4 shows the average end-to-end delay obtained with the different ML-based clustering protocols. The delay increases significantly with the number of sensor nodes in the network because longer links are established and more data transmissions take place. SCADA-ML reduces the delay compared to the existing methods because data transmission is performed through the robust link optimisation technique along with the optimal CH selection. The results also show that QLIQUE achieves lower delay than the CHKM protocol, as Q-learning takes less time for its operations than K-means.

Figure 5. Average energy consumption analysis vs. density of ML-based clustering methods (x-axis: number of sensor nodes, 40 to 400; y-axis: average consumed energy in nJ; series: CHDT, QLIQUE, CHKM, SCADA-ML)

Figure 6. Network lifetime analysis vs. density of ML-based clustering methods (x-axis: number of sensor nodes, 40 to 400; y-axis: network lifetime in rounds; series: CHDT, QLIQUE, CHKM, SCADA-ML)

Figures 5 and 6 show the energy efficiency of the ML-based clustering protocols in terms of average energy consumption and network lifetime, respectively. The two metrics are inversely related: if energy consumption increases, the network lifetime decreases. Both results show that SCADA-ML achieves a significant improvement in energy efficiency over the existing protocols, because an energy-aware cost function is used to establish data forwarding in the network and the optimal CH selection uses energy as its core parameter. The other ML-based clustering methods only perform optimal CH selection using an ML technique that exploits the energy parameter of the sensor nodes, and do not address energy-efficient data transmission.
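The inverse relation between energy consumption and network lifetime can be illustrated with a small calculation. The sketch assumes that lifetime is counted as the number of rounds completed before the first node exhausts its energy, and that each node spends a constant amount per round. Both are simplifying assumptions, and the per-round costs in the usage note are hypothetical.

```python
def network_lifetime_rounds(initial_energy_nj, per_round_cost_nj):
    """Rounds completed before the first node runs out of energy.

    per_round_cost_nj holds the (assumed constant) energy each node
    spends in one round; every node starts with initial_energy_nj.
    """
    if not per_round_cost_nj:
        raise ValueError("at least one node is required")
    if any(cost <= 0 for cost in per_round_cost_nj):
        raise ValueError("per-round costs must be positive")
    # The node with the highest per-round cost dies first.
    return min(int(initial_energy_nj // cost) for cost in per_round_cost_nj)
```

With the 5000 nJ initial energy of Table 1, per-round costs of 2.0, 2.5, and 4.0 nJ give a lifetime of 1250 rounds. Halving every cost doubles the lifetime to 2500 rounds.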

B. ML-based Data Aggregation Protocols Evaluation

In this section, SCADA-ML with data aggregation is compared with existing ML-based data aggregation techniques: data aggregation with a self-organizing map (CODA) [32] and data aggregation using PCA (DAPCA) [33].

Figure 7. Average throughput analysis vs. density (x-axis: number of sensor nodes, 40 to 400; y-axis: average throughput in kbps; series: CODA, DAPCA, SCADA-ML)

Figure 8. PDR analysis vs. density of ML-based aggregation methods

The final model of SCADA-ML, which combines the ML-based aggregation technique with the ML-based clustering algorithm, is investigated in this section. The performance of SCADA-ML is compared with the recent ML-based aggregation techniques CODA and DAPCA. Figures 7 and 8 show the average throughput and PDR for a varying number of sensor nodes for each aggregation technique. The results show that throughput and PDR increase with node density for all investigated techniques. SCADA-ML delivers higher throughput and PDR than the other techniques because the ICA-based aggregation algorithm reduces the number of transmissions and the ANN-based optimal CH selection leads to stable clusters in the network. The optimal data transmission using link cost optimization further improves the performance of SCADA-ML.
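How aggregation reduces the number of transmissions can be illustrated with a simple count. The sketch assumes that a CH forwards one packet per member when no aggregation is applied and a single packet when a similar-data cluster is aggregated. The single-packet output and the cluster sizes in the usage note are illustrative assumptions, not figures from the simulations.

```python
def packets_forwarded(clusters, aggregate=True):
    """Packets the CHs send to the sink in one round.

    clusters is a list of (member_count, has_similar_data) tuples.
    Only similar-data clusters are aggregated; the others forward
    every member packet unchanged.
    """
    total = 0
    for member_count, has_similar_data in clusters:
        if aggregate and has_similar_data:
            total += 1              # one aggregated packet (assumption)
        else:
            total += member_count   # raw packets are forwarded as-is
    return total
```

For three clusters of 10, 8, and 12 members where only the second holds dissimilar data, the count drops from 30 packets without aggregation to 10 packets with it.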

Figure 9 shows the average end-to-end delay obtained with the different ML-based aggregation protocols. Here the delay worsens as the number of sensor nodes in the network increases, because longer links are established and more data transmissions take place. SCADA-ML achieves lower delay than the CODA and DAPCA algorithms, as the ICA technique is more robust, efficient, and fast.

[Figure 8 plot: PDR (%) vs. number of sensor nodes (40 to 400); series: CODA, DAPCA, SCADA-ML]

Figure 9. Average end-to-end delay analysis vs. density of ML-based aggregation methods (x-axis: number of sensor nodes, 40 to 400; y-axis: delay in seconds; series: CODA, DAPCA, SCADA-ML)

Figure 10. Average energy consumption analysis vs. density of ML-based aggregation methods (x-axis: number of sensor nodes, 40 to 400; y-axis: average consumed energy in nJ; series: CODA, DAPCA, SCADA-ML)

Finally, Figure 10 shows the average energy consumption of all the ML-based aggregation techniques for each network size. The results show that SCADA-ML delivers a significant improvement in energy efficiency over the existing protocols, owing to the energy-aware link cost function used to establish data forwarding in the network, the optimal CH selection that uses energy as its core parameter, and the effective ICA-based data aggregation.

V. Conclusion and Future Work

This paper presented and described the design of the SCADA-ML protocol. The ANN technique is used for optimal CH selection and cluster formation in SCADA-ML. The neuron weights are computed using sensor node properties such as residual energy, geographical distance, and bandwidth availability. After clustering, the inter-cluster and intra-cluster data transmissions are performed via the link cost-based optimal path selection process to ensure reliability and QoS performance in WSNs. Finally, to reduce energy consumption, an ICA-based data aggregation algorithm is designed in SCADA-ML, which outperforms the other ML-based data aggregation methods in the simulations. The simulation results are presented for varying density and data rate parameters to support the scalability and reliability of SCADA-ML compared to existing ML-based clustering and data aggregation methods. SCADA-ML shows improved throughput, delay, PDR, and energy efficiency over all the investigated algorithms.
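As an illustration of the CH selection step summarised above, the sketch below scores each node with a single sigmoid neuron over the three input properties and picks the highest score. It is a deliberate simplification of the ANN (no hidden layer), and the weights, the min-max normalisation, and all names are hypothetical rather than taken from SCADA-ML.

```python
import math
from dataclasses import dataclass


@dataclass
class SensorNode:
    node_id: int
    residual_energy: float   # nJ
    distance_to_bs: float    # m
    bandwidth: float         # kbps


def _normalise(values):
    """Min-max scale a list of values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def ch_scores(nodes, weights=(1.0, -1.0, 0.5), bias=0.0):
    """Sigmoid score per node: more energy and bandwidth raise the score,
    a longer distance to the BS lowers it (hypothetical weights)."""
    energy = _normalise([n.residual_energy for n in nodes])
    distance = _normalise([n.distance_to_bs for n in nodes])
    bandwidth = _normalise([n.bandwidth for n in nodes])
    scores = {}
    for node, e, d, b in zip(nodes, energy, distance, bandwidth):
        z = weights[0] * e + weights[1] * d + weights[2] * b + bias
        scores[node.node_id] = 1.0 / (1.0 + math.exp(-z))
    return scores


def select_cluster_head(nodes):
    """Return the id of the node with the highest score."""
    scores = ch_scores(nodes)
    return max(scores, key=scores.get)
```

In a trained network the weights would be learned from data rather than fixed by hand as they are here.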

References

1. Cagalj, M., Hubaux, J.-P., and Enz, C. C., “Energy-efficient broadcasting in
all-wireless networks,” Wireless Networks, 11(1/2), 177–188, 2005.

2. Polastre, J., Szewczyk, R., and Culler, D., “Telos: Enabling ultra-low power

wireless research,” In Proceedings of international symposium on

information processing in sensor networks (pp. 364–369), 2005.

3. Chen, Y. P., Wang, D. and Zhang, J., “Variable-base tacit-communication: a

new energy efficient communication scheme for sensor networks,”


Proceedings of the First International Conference in Integrated Internet Ad

Hoc and Sensor Networks, InterSense 2006, Nice, France, May 30-31, 2006.

4. Chen, Y. P., Liestman, A. L., & Liu, J., “Energy-efficient data aggregation

hierarchy for wireless sensor networks,” In Proceedings of 2nd international

conference on quality of service in heterogeneous wired/wireless networks

(QShine ’05), Orlando, 2005.

5. Bharat Sundararaman, Ugo Buy, and Ajay D. Kshemkalyani, “Clock
Synchronization for Wireless Sensor Networks: A Survey,” Journal of Ad
Hoc Networks, Vol. 3, 2005, pp. 281-323.

6. Rowayda A. Sadek, “Hybrid energy aware clustered protocol for IoT
heterogeneous network,” https://doi.org/10.1016/j.fcij.2018.02.003, 2018.

7. Praveen Kumar Reddy, Rajasekhara Babu, “An Evolutionary Secure Energy
Efficient Routing Protocol in Internet of Things,” Vellore Institute of
Technology University, Vellore, Tamil Nadu, India, 2017.

8. Vellanki M, Kandukuri SPR, and Razaque A, “Node Level Energy Efficiency
Protocol for Internet of Things,” J Theor. Comput. Sci., 2017.

9. Samia Allaoua Chelloug, “Energy-Efficient Content-Based Routing in
Internet of Things,” Journal of Computer and Communications, 2015, 3, 9-20.
Published Online December 2015 in SciRes.

10. Faisal Karim Shaikh, Sherali Zeadally, and Ernesto Exposito, “Enabling
technologies for green internet of things,” IEEE Systems Journal, 11(2):983–994, 2017.

11. Chunsheng Zhu, Victor CM Leung, Lei Shu, and Edith C-H Ngai, “Green

internet of things for smart world,” IEEE Access, 3:2151–2162, 2015.

12. Al-Fagih, A.E.; Al-Turjman, F.M.; Alsalih, W.M.; Hassanein, H.S. A priced

public sensing framework for heterogeneous IoT architectures. IEEE Trans.

Emerg. Top. Comput. 2013, 1, 133–147.

13. Orsino, A.; Araniti, G.; Militano, L.; Alonso-Zarate, J.; Molinaro, A.; Iera, A.

Energy efficient iot data collection in smart cities exploiting D2D

communications. Sensors 2016, 16, 836.

14. Bello, O.; Zeadally, S. Intelligent device-to-device communication in the

Internet of things. IEEE Syst. J. 2016, 10, 1172–1182.

15. Runze Wan, Naixue Xiong, Qinghui Hu, Haijun Wang, and Jun Shang,
“Similarity-aware data aggregation using fuzzy c-means approach for
wireless sensor networks,” EURASIP Journal on Wireless Communications
and Networking, 2019:59, 2019.

16. Farzad Kiani, Ehsan Amiri, Mazdak Zamani, Touraj Khodadadi, and Azizah

AbdulManaf, "Efficient Intelligent Energy Routing Protocol in Wireless


Sensor Networks", Hindawi Publishing Corporation, International Journal of

Distributed Sensor Networks, 2015

17. Abdo M. T. Nasser, V. P. Pawar, "Machine Learning Approach for Sensors

Validation and Clustering", International Conference on Emerging Research

in Electronics, Computer Science and Technology – 2015

18. Deepak K. Sharma, Sanjay K. Dhurandher, Isaac Woungang, Rohit K.
Srivastava, Anhad Mohananey, and Joel J. P. C. Rodrigues, "A Machine
Learning-Based Protocol for Efficient Routing in Opportunistic Networks",
IEEE Systems Journal, vol. 12, no. 3, Sept. 2018.

19. Gowtham Muniraju, Sai Zhang, Cihan Tepedelenlioğlu, Mahesh K.
Banavar, "Location Based Distributed Spectral Clustering for Wireless
Sensor Networks", Sensor Signal Processing for Defence Conference
(SSPD), 2017.

20. B. Krishnamachari, D. Estrin, and S.B. Wicker, “The Impact of Data

Aggregation in Wireless Sensor Networks,” Proc. 22nd Int’l Conf.

Distributed Computing Systems (ICDCSW ’02), pp. 575-578, 2002.

21. C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva,

“Directed Diffusion for Wireless Sensor Networking,” IEEE/ACM Trans.

Networking, vol. 11, no. 1, pp. 2-16, Feb. 2003.

22. C. Intanagonwiwat, D. Estrin, R. Govindan, and J. Heidemann, “Impact of

Network Density on Data Aggregation in Wireless Sensor Networks,” Proc.

22nd Int’l Conf. Distributed Computing Systems, pp. 457-458, 2002.

23. E.F. Nakamura, H.A.B.F. de Oliveira, L.F. Pontello, and A.A.F. Loureiro,

“On Demand Role Assignment for Event-Detection in Sensor Networks,”

Proc. IEEE 11th Symp. Computers and Comm. (ISCC ’06), pp. 941-947,

2006.

24. S. Madden, M.J. Franklin, J.M. Hellerstein, and W. Hong, “Tag: A Tiny

Aggregation Service for Ad-Hoc Sensor Networks,” ACM SIGOPS

Operating Systems Rev., vol. 36, no. SI, pp. 131-146, 2002.

25. L.Y. Sun, X.X. Huang, W. Cai, Data aggregation of wireless sensor

networks using artificial neural networks. Chinese Journal of Sensors and

Actuators. 24(1), 122–127 (2011).

26. J.N. Al-Karaki, R. Ul-Mustafa, A.E. Kamal, Data aggregation and routing in
wireless sensor networks: optimal and heuristic algorithms. Comput. Netw.
53(7), 945–960 (2009).

27. J. Xu, J.X. Li, S. Xu, Data fusion for target tracking in wireless sensor

networks using quantized innovations and Kalman filtering. Science China:

Information science edition. 55(3), 530–544 (2012).


28. H. Li, H.Y. Yu, Research on data aggregation supporting QoS in wireless

sensor networks. Application Research of Computers. 25(1), 64–67 (2008).

29. G. Ahmed, N. M. Khan, Z. Khalid, and R. Ramer, Cluster head selection
using decision trees for Wireless Sensor Networks, IEEE International
Conference on Intelligent Sensors, Sensor Networks and Information
Processing, 2008.

30. A. Forster and A. L. Murphy, CLIQUE: Role-Free Clustering with Q-Learning
for Wireless Sensor Networks, 29th IEEE International Conference
on Distributed Computing Systems, 2009.

31. G. Muniraju, S. Zhang, C. Tepedelenlioglu, and M. K. Banavar, Location
Based Distributed Spectral Clustering for Wireless Sensor Networks, IEEE
Sensor Signal Processing for Defence Conference (SSPD), 2017.

32. SangHak Lee and TaeChoong Chung, Data Aggregation for Wireless Sensor
Networks Using Self-organizing Map, Springer-Verlag Berlin Heidelberg,
2005.

33. A. Morell, A. Correa, M. Barceló, and J. L. Vicario, Data Aggregation and
Principal Component Analysis in WSNs, IEEE Transactions on Wireless
Communications, vol. 15, no. 6, pp. 3908-3919, 2016.
