Data Aggregation in Cluster-based Wireless Sensor...

DDaattaa AAggggrreeggaattiioonn iinn CClluusstteerr--bbaasseedd WWiirreelleessss SSeennssoorr NNeettwwoorrkkss

A Thesis Submitted in partial fulfillment

of the requirements for the award of the degree of

MASTER OF TECHNOLOGY in

INFORMATION TECHNOLOGY (Specialization: WIRELESS COMMUNICATION & COMPUTING)

by

Pranay Tiwari (IWC2006014)

Under the Guidance of: Prof. U. S. Tiwary

IIIT-Allahabad

INDIAN INSTITUTE OF INFORMATION TECHNOLOGY ALLAHABAD – 211 012 (INDIA)

July, 2008

Date: ___________________

I HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER

OUR SUPERVISION BY Pranay Tiwari ENTITLED Data

Aggregation in Cluster-based Wireless Sensor Network BE

ACCEPTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS

FOR THE DEGREE OF MASTER OF TECHNOLOGY IN

INFORMATION TECHNOLOGY FOR EXAMINATION

Prof. U. S. Tiwary

(THESIS ADVISOR)

COUNTERSIGNED

Prof. U. S. Tiwary DEAN (ACADEMIC)

IINNDDIIAANN IINNSSTTIITTUUTTEE OOFF IINNFFOORRMMAATTIIOONN TTEECCHHNNOOLLOOGGYY AAllllaahhaabbaadd

(Deemed University) (A Centre of Excellence in Information Technology Established by Govt. of India)

CERTIFICATE OF APPROVAL* The foregoing thesis is hereby approved as a creditable study in

the area of Information Technology carried out and presented in a

manner satisfactory to warrant its acceptance as a pre-requisite to the

degree for which it has been submitted. It is understood that by this

approval the undersigned do not necessarily endorse or approve any

statement made, opinion expressed or conclusion drawn therein but

approve the thesis only for the purpose for which it is submitted.

COMMITTEE

ON

FINAL EXAMINATION

FOR EVALUATION

OF THE THESIS

*Only in case the recommendation is concurred in



DECLARATION

This is to certify that this thesis work entitled “Data Aggregation in

Cluster-based Wireless Sensor Networks’’ which is submitted by me in

partial fulfillment of the requirement for the completion of M.Tech. in

Information Technology specialization in Wireless Communication and

Computing to Indian Institute Of Information Technology, Allahabad

comprises only my original work and due acknowledgement has been made

in the text to all other materials used.

Pranay Tiwari

M.Tech. (WCC)

IWC200614



Abstract

The rapid advancement of hardware technology has enabled the development of small,

powerful, and inexpensive sensor nodes, which are capable of sensing, computation and

wireless communication. This revolutionizes the deployment of wireless sensor network

for monitoring some area and collecting regarding information. However, limited energy

constraint presents a major challenge such vision to become reality.

We consider energy constrained wireless sensor network deployed over a region. The

main task of such a network is to gather information from node and transmit it to base

station for further processing. So the aim of any data forwarding protocol is to conserve

energy to maximize the network lifetime. Sensor nodes are capable of performing in-

network aggregation of data coming from more than one source.

In this thesis we have concentrated on energy consumption issue and aim to develop an

energy efficient data aggregation protocol. To provide energy efficiency we have

considered a cluster-based wireless sensor network. Our protocol executes on each

cluster independently and provides an energy efficient data aggregation in a cluster and

hence maximize network lifetime for whole network.

Acknowledgements

Before, I get into thick of things; I would like to add a few heartfelt words for the people

who were part of my thesis in numerous ways, people who gave unending support right

from the beginning. During this period, the faculty members and my batch mates took

keen interest and participated actively. They are very efficient and qualified in their

respective disciplines.

I express my sincere gratitude to my thesis supervisor Prof. U.S. Tiwary, Indian

Institute of Information Technology-Allahabad for all his affectionate encouragement

and guidance during the entire Thesis. His views and inputs are very helpful throughout

the process.

I would like to thank Dr. Shirshu Verma, Indian Institute of Information

Technology-Allahabad, who suggested many related points and is always very

constructive and helpful.

I would like to thank Dr. M.D. Tiwari, Hon’ble Director, Indian Institute of

Information Technology-Allahabad for the facilities and environment for research.

Not the least I would like to appreciate the support and suggestions of my dear friend

Neelam who continuously encouraged me during the progress of my thesis work and my

M.Tech friends Anuraag, Koustubh, Lalit, and Satish for everyday chatting we had on

several topics.

Lastly I would like to thank my family for their love, support and encouragement that

they have given me throughout my life, helping me to persevere in my studies. In

addition, I thank almighty god who made me a normal human being and gave me such

strength.

List of Figures

Figure 2.1 Simplified Schematic of Directed Diffusion 12

Figure 2.2 E-Span protocol 14 Figure 2.3 Example of Aggregation path over ring structure 15 Figure 2.4 Illustration of Two-phase clustering 17

Figure 2.5 Data transmission using ESPDA 18 Figure 3.1 A typical scenario of data aggregation in a cluster 23 Figure 3.2 Parent selection procedure 24 Figure 4.1 Random nodes scenario in NAM 29 Figure 4.2 Snapshot of console 32 Figure 4.3 Data transfer between node 0 and 10 through node 1 & 2 33 Figure 4.4 Energy level of nodes drop to first threshold 34 Figure 4.5 Energy level of nodes drop to second threshold 35 Figure 4.6 Residual energy of source as a function of time 36 Figure 4.7 Throughput as function of time 37 Figure 4.8 Effect of network density 38 Figure 4.9 Packet delivery ratio of the network 39 Figure A.1 The basic structure of NS-2 43

Table of Contents

CERTIFICATE OF APPROVAL DECLARATION ABSTRACT ACKNOWLEDGEMENTS LIST OF FIGURES Chapter 1 INTRODUCTION …………………………………………….. 1

Wireless Sensor Networks …………………………………….. 1 1.1.1 Sensor Network Challenges ………………………………… 2 1.1.2 Wireless Sensor Network vs. Traditional Wireless Network…. 4

1.1

1.1.3 Clustering in WSN ………………………………………….. 5 1.2 Motivation ……………………………………………………... 6 1.3 Problem Definition …………………………………………….. 7 1.4 Report Organization …………………………………………... 8

Chapter 2 DATA AGGREGATION: AN OVERVIEW ………………... 9 2.1 In-Network Aggregation ……………………………………… 9 2.2 Tree-Based Approaches ………………………………………. 11 2.3 Multi-path Approaches ……………………………………….. 14 2.4 Cluster-based Approaches …………………………………..... 16 2.5 Simulation Tools ………………………………………………. 18 2.6 Summary ………………………………………………………. 19

Chapter 3 ENERGY-AWARE BALANCED IN-NETWORK AGGREGATION………………………………………………

20

3.1 Introduction ……………………………………………............. 20

3.2 System & Energy Model ……………………………………… 21

Protocol Description …………………………………………... 22

3.3.1 Configuration Packet Flow …………………………………… 23

3.3

3.3.2 Data Packet Flow ……………………………………………... 24

3.4 Issues …………………………………………………………… 26

3.5 Summary ………………………………………………………. 26

Chapter 4 PERFORMANCE ANALYSIS ………………………………. 28

4.1 Simulation Analysis …………………………………………… 28

4.1.1 Simulation Setup ……………………………………………. 28

4.1.2 Simulation Run ………………………………………........... 33

4.1.3 Simulation Results …………………………………….......... 36

Chapter 5 CONCLUSION & FUTURE WORK ………………………... 40

Appendix A Network Simulator 2 …………………………………………............ 42

References ……………………………………………………...................................... 47

Data Aggregation in Cluster-based Wireless Sensor Networks

Chapter 1

Introduction

Before giving any outline of this thesis, we are going to introduce you to the

world of wireless sensor networks. We will briefly describe challenges and application of

wireless sensor networks. We will also provide you the motivation behind the selection of

the data aggregation issue for this thesis and also describe what exactly the data

aggregation is. Finally we will conclude this with providing outline of the thesis report.

1.1 Wireless Sensor Networks

A wireless sensor network is a wireless network consisting of tiny devices which

monitor physical or environmental conditions such as temperature, pressure, motion or

pollutants etc. at different regions. The tiny device, known as sensor node, consists of a

radio transceiver, microcontroller, power supply, and the actual sensor. Initially sensor

network were used for military applications but now they are widely used for civilian

application area including environment and habitat monitoring, healthcare application

and so on.

Normally sensor nodes are spatially distributed throughout the region which has

to be monitored; they self-organize in to a network through wireless communication, and

collaborate with each other to accomplish the common task. With the going time, sensor

nodes are becoming smaller, cheaper, and more powerful which enable us to deploy a

large-scale sensor network.

Basic features of sensor networks are self-organizing capabilities, dynamic

network topology, limited power, node failures & mobility of nodes, short-range

M.Tech. - Thesis (July 2008) Indian Institute of Information Technology, Allahabad 1


broadcast communication and multi-hop routing, and large scale of deployment [12]. The

strength of wireless sensor network lies in their flexibility and scalability. The capability

of self-organize and wireless communication made them to be deployed in an ad-hoc

fashion in remote or hazardous location without the need of any existing infrastructure.

Through multi-hop communication a sensor node can communicate a far away node in

the network. This allows the addition of sensor nodes in the network to expand the

monitored area and hence proves its scalability & flexibility property.

Presently there are different types of commercially available sensor nodes.

University of California at Berkeley has developed Mica mote which is a special purpose

sensor node. Other special purpose sensor nodes available are Spec, Rene, Mica 2, Telos

etc. Some high bandwidth sensor nodes available are BTNode, Imote 1.0, Stargate,

Inryonc Cerfeube etc. [13].

1.1.1 Sensor Network Challenges

Wireless sensor network promise a wide variety of application and to realize these

application in real world, we need more efficient protocols and algorithms. Designing a

new protocol or algorithm address some challenges which are need to be clearly

understood. These challenges are summarized below:

Physical Resource Constraints: The most important constraint

imposed on sensor network is the limited battery power of sensor nodes. The

effective lifetime of a sensor node is directly determined by its power supply.

Hence lifetime of a sensor network is also determined by the power supply.

Hence the energy consumption is main design issue of a protocol. Limited

computational power and memory size is another constraint that affects the

amount of data that can be stored in individual sensor nodes. So the protocol

should be simple and light-weighted. Communication delay in sensor network



can be high due to limited communication channel shared by all nodes within

each other’s transmission range.

Ad-hoc Deployment: Many application or most of them requires the ad-hoc

deployment of sensor nodes in the region. Sensor nodes are randomly deployed

over the region without any infrastructure which requires the system to be able to

cope up with random distribution and form connection between the nodes. As an

example, for fire detection in a forest the nodes typically would be dropped in to

the forest from a plane.

Fault-Tolerance: In a hostile environment, a sensor node may fail due to

physical damage or lack of energy (power). If some nodes fail, the protocols that

are working upon must accommodate these changes in the network. As an

example, for routing or aggregation protocol, they must find suitable paths or

aggregation point in case of these kinds of failures.

Scalability: In a region, depending upon the application, the number of sensor

nodes deployed could be in order of hundreds, thousands or more. The protocols

must scalable enough to respond and operate with such large number of sensor

nodes.

Quality of Service: Some sensor application are very time critical which

means the data should be delivered within a certain period of time from the

moment it is sensed, otherwise the data will be careless. So this could be a QOS

parameter for some applications.



1.1.2 Wireless Sensor Networks vs. Traditional Wireless

Networks

Though many existing protocol, techniques and concepts from traditional wireless

network, such as cellular network, mobile ad-hoc network, wireless local area network

and Bluetooth, are applicable and still used in wireless sensor network, but there are also

many fundamental differences which lead to the need of new protocols & techniques.

Some of the most important characteristic differences are summarized below:

Number of nodes in wireless sensor network is much higher than any traditional

wireless network. Possibly a sensor network has to scale number of nodes to

thousands. Moreover a sensor network might need to extend the monitored area

and has to increase number of nodes from time to time. This needs a highly

scalable solution to ensure sensor network operations without any problem.

Due to large number of sensor nodes, addresses are not assigned to the sensor

nodes. Sensor networks are not address-centric; instead they are data-centric

network. Operations in sensor networks are centered on data instead of individual

sensor node. As a result sensor nodes require collaborative efforts.

Wireless sensor networks are environment-driven. While data is generated by

humans in traditional networks, the sensor network generate data when

environment changes. As a result the traffic pattern changes dramatically from

time to time.

Another characteristic unique to wireless sensor network is the correlated data

problem. Data collected by neighboring sensor nodes are often quite similar

which makes possible to the development of routing and aggregation techniques

that can reduce redundancy and improve energy efficiency. It also been observed



that the environmental quantities changes very slow and some consecutive

readings sense temporally correlated data. This advantageous feature can be

exploited to develop an energy efficient data gathering and aggregation

techniques.

1.1.3 Clustering in WSN

It is widely accepted that the energy consumed in one bit of data transfer can be

used to perform a large number of arithmetic operations in the sensor processor [13].

Moreover in a densely deployed sensor network the physical environment would produce

very similar data in close-by sensor nodes and transmitting such data is more or less

redundant. Therefore, all these facts encourage using some kind of grouping of nodes

such that data from sensor nodes of a group can be combined or compressed together in

an intelligent way and transmit only compact data. This can not only reduce the global

data to be transmitted and localized most traffic to within each individual group, but

reduces the traffic and hence contention in a wireless sensor network. This process of

grouping of sensor nodes in a densely deployed large-scale sensor network is known as

clustering. The intelligent way to combined and compress the data belonging to a single

cluster is known as data aggregation.

There are some issues involved with the process of clustering in a wireless sensor

network. First issue is, how many clusters should be formed that could optimize some

performance parameter. Second could be how many nodes should be taken in to a single

cluster. Third important issue is the selection procedure of cluster-head in a cluster.

Another issue that has been focused in many research papers is to introduce heterogeneity

in the network. It means that user can put some more powerful nodes, in terms of energy,

in the network which can act as a cluster-head and other simple node work as cluster-

member only. Considering the above issues, many protocols have been proposed which

deals with each individual issue.



1.2 Motivation

Wireless sensor network are new age technology which can be used at a place

which is possibly hostile & human inaccessible. Wireless sensor network is a class of

wireless network which consists of thousands of densely deployed sensor nodes which

can be used for a number of applications. Sensor nodes are tiny devices which are

composed of a sensing unit, a radio, a processor & a limited battery power. A network of

thousands of sensor nodes could be setup for many applications such as environmental

monitoring, health monitoring, disaster management, industrial areas, military application

and many more.

In wireless sensor network, there are so many challenges & issues as above

already been discussed. The main challenges are how to provide maximum lifetime to

network and how to provide robustness to network. As sensor network totally rely on

battery power, the main aim for maximizing lifetime of network is to conserve battery

power or energy.

In sensor network, the energy is mainly consumed for three purposes: data

transmission, signal processing, and hardware operation. It is said in [4] that 70% of

energy consumption is due to data transmission. So for maximizing the network lifetime,

the process of data transmission should be optimized. The data transmission can be

optimized by using efficient routing protocols and effective ways of data aggregation.

Routing protocols have their own ways to save energy of nodes in the network by

providing or creating an optimal route from sensor nodes to base station or sink. Data

aggregation plays an important role in energy conservation of sensor network. Data

aggregation methods are used not only for finding an optimal path from source to

destination but also to eliminate the redundancy of data, since transmitting huge volume

of raw data is an energy intensive operation, and thus minimizing the number of data

transmission. Also multiple sensors may see the same phenomenon, albeit from different

view and if this data can be reconciled into a more meaningful form as it passes through

the network, it becomes more useful to an application. One more benefit of data



aggregation is that if data is processed as it is passed through the network, it may be

compressed thus occupying less bandwidth. This also reduces the amount of transmission

power expended by nodes. Hence data aggregation can be considered as a very

challenging problem in wireless sensor network.

1.3 Problem Definition

Data aggregation protocols aims at eliminating redundant data transmission and

thus improve the lifetime of energy constrained wireless sensor network. In wireless

sensor network, data transmission took place in multi-hop fashion where each node

forwards its data to the neighbor node which is nearer to sink. That neighbor node

performs aggregation function and again forwards it on. But performing data forwarding

and aggregation in this fashion from various sources to sink causes significant energy

waste as each node in the network is involved in operation. Since closely placed nodes

may sense same data, above approach cannot be considered as energy efficient. An

improvement over the above approach would be clustering where each node sends data to

cluster-head (CH) and then cluster-head perform aggregation on the received raw data

and then send it to sink. Performing aggregation function over cluster-head still causes

significant energy wastage. In case of homogeneous sensor network cluster-head will

soon die out and again re-clustering has to be done which again cause energy

consumption.

We would like to present an algorithm that performs data aggregation within a

cluster and thus reducing the load of aggregation at cluster-head to provide energy

efficiency for maximizing network lifetime.



1.4 Report Organization

With the end of chapter 1, the whole report is organized as follows:

Chapter 2 gives a detailed overview of data aggregation. This chapter also presents the

literature survey that had been done.

Chapter 3 introduces and describes the new proposed protocol for data aggregation in

cluster-based wireless sensor networks.

Chapter 4 will present the performance analysis of the proposed protocol. It will also

provide you the comparison results.

Conclusion is given in the last chapter 5 and scope of future enhancements is also

incorporated.

Appendix – A describes the network simulator NS-2 and its usability to simulate

proposed protocol.

Materials (e.g. URLs, books, and research papers) used and studied are given in

Reference.



Chapter 2

Data Aggregation: An Overview

Wireless sensor network have recently been the focus of many research efforts

and emerged as an important new area in wireless technology. As chapter 1 already

discuss about the problem of data aggregation, this chapter presents an overview of data

aggregation. This chapter first discusses in-network aggregation and then different

approaches that are widely used for data aggregation.

2.1 In-Network Aggregation

In a typical sensor network scenario, different node collect data from the

environment and then send it to some central node or sink which analyze and process the

data and then send it to the application. But in many cases, data produced by different

node can be jointly processed while being forwarded to the sink node. So in-network

aggregation deals with this distributed processing of data within the network.

Data aggregation techniques explore how the data is to be routed in the network

as well as the processing method that are applied on the packets received by a node. They

have a great impact on the energy consumption of nodes and thus on network efficiency

by reducing number of transmission or length of packet. Elena Fosolo et al in [7] defines

the in-network aggregation process as follows: “In-network aggregation is the global

process of gathering and routing information through a multi-hop network, processing

data at intermediate nodes with the objective of reducing resource consumption (in

particular energy), thereby increasing network lifetime.”



There are two approaches for in-network aggregation: with size reduction and

without size reduction. In-network aggregation with size reduction refers to the process

of combining & compressing the data packets received by a node from its neighbors in

order to reduce the packet length to be transmitted or forwarded towards sink. As an

example, consider the situation when a node receives two packets which have a spatial

correlated data. In this case it is worthless to send both packets. Instead of that one should

apply any function like AVG, MAX, MIN and then send a single packet. This approach

considerably reduces the amount of bits transmitted in the network and thus saving a lot

of energy but on the other hand, it also reduces the precision of value of data received. In-

network aggregation without size reduction refers to the process merging data packets

received from different neighbors in to a single data packet but without processing the

value of data. As an example, two packets may contain different physical quantities (like

temperature & humidity) and they can be merged in to a single packet by keeping both

values intact but keeping a single header. This approach preserves the value of data and

thus transmit more bits in the network but still reduce the overhead by keeping single

header.

This of the two approaches to use depends on many factors like the type of

application, data rate, network characteristics and so on. There is also a trade-off between

energy consumption and precision of data for the two approaches.

Most of the work done till now on in-network aggregation mainly deals with

problem of forwarding packets from source to sink, to facilitate aggregation therein.

Actually the main idea behind were to enhance existing routing protocols such that they

can efficiently aggregate data. So till now, most of the data aggregation techniques fall

under three categories. They are tree-based approaches, multi-path approaches, and

cluster-based approaches. There also some hybrid approaches that combines any of the

three techniques above. So, all the three approaches will be described in coming sections

with giving details of some of the main techniques by different authors.



2.2 Tree-Based Approach

The simplest way to aggregate data is to organize the nodes in a hierarchical

manner and then select some nodes as the aggregation point or aggregators. The tree-

based approach perform aggregation by constructing an aggregation tree, which could be

a minimum spanning tree, rooted at sink and source nodes are considered as leaves. Each

node has a parent node to forward its data. Flow of data starts from leaves nodes up to the

sink and therein the aggregation done by parent nodes. The way this approach operates

has some drawbacks. As we know like any wireless network the wireless sensor networks

are also not free from failures. In case of packet loss at any level of tree, the data will be

lost not only for a single level but for whole related sub-tree as well. In spite of high cost

for maintaining tree structure in dynamic networks and scarce robustness of the system,

this approach is very much suitable for designing optimal aggregation technique and

energy-efficient techniques.

S. Madden et al. in [14] proposed a data-centric protocol which is based on

aggregation tress, known as Tiny Aggregation (TAG) approach [14]. TAG works in two

phases: distribution phase and collection phase. In distribution phase, TAG organizes

nodes in to a routing tree rooted at sink. The tree formation starts with broadcasting a

message from sink specify level or distance from root. When a node receive this message

it sets its own level to be the level of message plus one and elect parent as node from

which it receives the message. After that, node rebroadcast this message with its own

level. This process continues until all nodes elect their parent. After tree formation, sink

send queries along structure to all nodes in the network. TAG uses database query

language (SQL) for selection and aggregation functions. In collection phase, data is

forwarded and aggregated from leaves nodes to root. A parent node has to wait for data

from all its child node before it can send its aggregate up the tree. Apart from the simple

aggregation function provided by SQL (eg: COUNT, MIN, MAX, SUM, and AVG),

TAG also partitions aggregates according to the duplicate sensitivity, exemplary and

summary, and monotonic properties. Though TAG periodically refresh tree structure of



network but as most of the tree-based schemes are inefficient for dynamic network, so

TAG may be.

C. Intanagonwiwat et al. in [3] proposed a reactive data-centric protocol for

applications where sink ask some specific information by flooding, known as directed

diffusion paradigm. The main idea behind directed diffusion paradigm is to combine data

coming from different source and en-route them by eliminating redundancy, minimizing

the number of data transmission; thus maximizing network lifetime. Directed diffusion

consists of several elements: interests, data messages, gradients, and reinforcements.

Figure 2.1 Simplified schematic for directed diffusion. (a) Interest propagation. (b) Initial

gradients setup. (c) Data delivery along reinforced path [3].

The base station (BS) requests data by broadcasting an interest message which

contains a description of a sensing task. This interest message propagates through the

network hop-by-hop and each node also broadcast interest message to its neighbor. As

interest message propagates throughout the network, gradients are setup by every node

within the network. The gradient direction is set toward the neighboring node from which

the interest is received. This process continues until gradients are setup from source node

to base station. Loops are not checked at this stage but removed at later stage. After this

path of information flow are formed and then best path are reinforced to prevent further

flooding according to a local rule. Data aggregation took place on the way of different

paths from different sources to base station or sink. The base station periodically refresh

& resend the interest message as soon as it start to receives data from sources to provide

reliability. The problem with directed diffusion is that it may not be applied to

applications (e.g. environmental monitoring) that require continuous data delivery to base



station. This is because query driven on demand data model may not help in this regard.

Also matching data to queries might require some extra overhead at the sensor nodes.

Mobility of sink nodes can also degrade the performance as path from sources to sinks

cannot be updated until next interest message is flooded throughout the network. To cope

up with above issue if introduce frequent flooding then also too much overhead of

bandwidth and battery power will be introduced. Furthermore, exploratory data follow all

possible paths in the network following gradients which lead to unnecessary

communications overhead.

M. Lee et al. in [2] proposed a new low-control-overhead data dissemination

scheme, which they called as pseudo-distance data dissemination (PDDD), for efficiently

disseminating data from all sensor nodes to mobile sink. Some assumption have been

made, they are: (1) all source nodes maintain routes to mobile sink node, (2) no

periodically messaging for topological changes due to mobile sink node, (3) all link are

bi-directional and no control messages are lost, (4) mobile sink nodes have unlimited

battery power, so no need to care about battery efficiency of sink node, and (5) network

partitioning is not considered. Data dissemination process is influenced by directed

diffusion [3]. Though mobile sink periodically broadcast interest message, sensor nodes

do not send exploratory data and do not wait reinforcement message because each sensor

node already has routes to the sink node. After getting interest message, adjacent nodes

set a parent-child relationship using pseudo-distance of each node and finally a partial

ordered graph (POG) has been build. Optimal data dissemination is achieved in terms of

path length by forwarding packets to a parent node until topology is unchanged. Then

each sensor node is assigned a level for a corresponding sink node with pseudo-distance.

In order to overcome the shortcoming of POG, author used totally ordered graph (TOG)

in place of POG. The problem identified in this approach is that due to mobility of sink

node all sensor nodes have to maintain routes and for any change in topology nodes have

to again change route accordingly which led to energy waste.

Marc Lee et al. in [8] proposed an energy-aware spanning tree algorithm for data

aggregation, referred as E-Span. E-Span is a distributed protocol in which source node

that has highest residual energy is chosen as root. Other source nodes choose their parent



based on residual energy and distance to the root. The protocol uses configuration

message to exchange information of node i.e., residual energy and distance to the root.

Each node performs single-hop broadcast operation to send packets. Single-hop broadcast

refers to the operation of sending a packet to all single-hop neighbors [8].

Figure 2.2 E-Span protocol [8].

2.3 Multi-path Approach

One of the main drawbacks of tree-based approach is the scarce robustness of the

system. To overcome this drawback, a new approach was proposed by many researchers.

Instead of sending partially aggregated data to a single parent node in aggregation tree, a

node could send data over multiple paths. The idea behind is that each node can send the

data to its possibly multiple neighbors by exploiting the wireless medium characteristic.

Hence data will flow from sources to sink along multiple paths and aggregation can be

performed by each intermediate node. Clearly schemes using this approach will make the

system robust but with some extra overhead. One of the aggregation structures that fit

well with this approach is ring topology, where network is divided in to concentric circles

with defining levels according to the hop distance from sink.

S. Nath et al. in [15] presented a data aggregation technique using multi-path

approach, known as synopsis diffusion. Synopsis diffusion works in two phases:

distribution of queries and data retrieval phase. During distribution of queries phase, a

node sends a query in the network. The network nodes then form a set of rings around the



querying node. The node which is i hop away from querying node is considered is ring

Ri. In the second phase, aggregation starts from outermost ring and propagate level by

level towards the sink. Here a source node can have multiple paths towards sink.

Figure 2.3 Examples of aggregation paths over a ring structure [7].

L. Gatani et al. in [5] describe a new strategy for data gathering in wireless sensor

network that consider both issues: energy efficiency and robustness. Authors first say that

single path to connect each node to the base station is simple & energy-saving approach

but expose a high risk of disconnection due to node/link failures. But multi-path approach

would require more nodes to participate with consequent waste of energy. Authors

present a clever use of multi-path only when there is loss of packet which is implemented

by smart caching of data at sensor nodes. Authors also argue that in many practical

situation data may be gathered only from a particular region, so they use a different

approach that relies on a spanning tree and provides alternative paths only when a

malfunctioning is detected. Algorithm adopts a tree-based approach for forwarding

packets through the network. In the ideal situation when no failures occur, this is

certainly the best choice, as the minimum number of nodes is engaged in the transmission

phase. In the presence of link or node failures, the algorithm will discover alternative



paths, so as ensure the delivery of as many packets as possible within the time

constraints. The problem with this approach is that it may cause the arising of hot spots

and nodes along preferred paths will consume their energy resources quickly, possibly

causing disconnection in the network.

2.4 Cluster-Based Approach

We talked about hierarchical organization of the network in tree-based approach.

Another scheme to organize the network in hierarchical manner is cluster-based

approach. In cluster-based approach, whole network is divided in to several clusters.

Each cluster has a cluster-head which is selected among cluster members. Cluster-heads

do the role of aggregator which aggregate data received from cluster members locally and

then transmit the result to sink. The advantages and disadvantages of the cluster-based

approaches is very much similar to tree-based approaches.

K. Dasgupta et al. in [16] proposed a maximum lifetime data aggregation

(MLDA) algorithm which finds data gathering schedule provided location of sensors and

base-station, data packet size, and energy of each sensor. A data gathering schedule

specifies how data packet are collected from sensors and transmitted to base station for

each round. A schedule can be thought of as a collection of aggregation trees. In [6], they

proposed heuristic-greedy clustering-based MLDA based on MLDA algorithm. In this

they partitioned the network in to cluster and referred each cluster as super-sensor. They

then compute maximum lifetime schedule for the super-sensors and then use this

schedule to construct aggregation trees for the sensors.

W. Choi et al. in [1] present a two-phase clustering (TPC) scheme. Phase I of this

scheme creates clusters with a cluster-head and each node within that cluster form a

direct link with cluster-head. Phase I of this scheme is similar to various scheme used for

clustering but differ in one way that the cluster-head rotation is localized and is done

based on the remaining energy level of the sensor nodes which minimize time variance of

sensors and this lead to energy saving from unnecessary cluster-head rotation. In phase II,



each node within the cluster searches for a neighbor closer than cluster-head which is

called data relay point and setup up a data relay link. Now the sensor nodes within a

cluster either use direct link or data relay link to send their data to cluster head which is

an energy efficient scheme. The data relay point aggregates data at forwarding time to

another data relay point or cluster-head. In case of high network density, TPC phase II

will setup unnecessary data relay link between neighbors as closely deployed sensor will

sense same data and this lead to a waste of energy.

Figure 2.4 Illustration of Two Phase Clustering [1].

H. Cam et al. in [4] present energy efficient and secure pattern based data

aggregation protocol which is designed for clustered environment. In conventional

method data is aggregated at cluster-head and cluster-head eliminate redundancy by

checking the content of data. This protocol says that instead of sending raw data to

cluster-head, the cluster members send corresponding pattern codes to cluster-head for

data aggregation. If multiple nodes send the same pattern code then only one of them is

finally selected for sending actual data to cluster-head. For pattern matching, authors

present a pattern comparison algorithm.



Figure 2.5 Data transmission using ESPDA [4].

2.5 Simulation Tools

We have plenty of simulation tools or simulators for simulating wireless

networks. The simulators which are most popular are NS-2, OPNET, OMNet++, J-Sim,

GlomoSim, Qualnet, TOSSIM and so on. Since wireless sensor networks are special type

of wireless networks, most of the simulators available are not enough supported for

simulating a wireless sensor network scenario. The literature shows that the simulators

which are mostly used for wireless sensor network are NS-2, J-Sim, GlomoSim, OPNET,

TOSSIM, PROWLER and even MATLAB is also used.

NS-2 is the most popular and powerful simulator. NS-2 is an object-oriented

discrete time event simulator and its modular design made it to be extensible. The detail

of NS-2 is provided in appendix A. To simulate sensor network, there had been made

attempt to put some add-ons. The most appreciable extension to NS-2 for wireless sensor

networks was developed in 2004 by Ian Downward of Naval Research Laboratory (NRL)

[18]. In this extension, they had created phenomenon packet which trigger event for

sensor nodes. They also designed sensor agent, sensor application and other application

to support wireless sensor network. But there has been no improvement or work done

since 2006. This is the reason why all research is going by using NS-2 without any

extension support. Other simulators like J-Sim, GlomoSim, OPNET, OMNet++ are



exceptionally used by researchers. Another simulators which is MATLAB based, known

as PROWLER, is sometimes used for simulating routing protocol or some MAC issues in

wireless sensor network. MATLAB is also used for simulating some physical layer

issues. But still there is no efficient simulator which is purely dedicated to wireless sensor

network. More details for comparative study on simulator for wireless sensor network

could be found in [19].

2.6 Summary

In-network data aggregation is the process of gathering data from all nodes and

processes them at intermediate node while forwarding towards sink.

Tree-based approaches construct a minimum spanning aggregation tree rooted at

sink and covering all nodes in the network.

In multi-path approaches a node choose more than one parent for forwarding data

so as to have multiple paths towards sink to make the system robust.

Cluster-based approaches organize the network in to several clusters each with a

cluster-head which responsible for aggregating data locally before forwarding

towards sink.

Many simulators are used by researchers but NS-2 is the most efficient and

widely used.



Chapter 3

Energy-aware Balanced In-Network Aggregation

After the detailed description of wireless sensor networks, problem of data

aggregation for maximizing network lifetime, and existing aggregation techniques in

chapter 1 and 2, we now present our proposed protocol for solving data aggregation

problem. This chapter first gives you an introduction of our protocol “Energy-aware

Balanced In-Network Aggregation (E-BINA)” and then briefly describes the system and

energy model that has been considered for designed protocol. After this a detailed

description of the algorithm will be presented. As there are some issues involved with

every protocol, so with our proposed protocol also. These issues will be presented in brief

and finally we summarize this chapter.

3.1 Introduction

The three broad categories of data aggregation techniques that we have described

in previous chapter are: tree-based approach, multi-path approach, and cluster-based

approach. Focusing on cluster-based approach, we found that the existing protocols

assume a sensor network which is divided in to several clusters. Depending upon the

protocol operation, each cluster-head receives the data packets from some cluster-

member or from all cluster-member nodes directly and then cluster-head perform

aggregation operation.

Taking the advantageous features of tree-based approach, we have designed our

protocol which takes the merits of both cluster-based and tree-based approach. E-BINA

assumes a cluster-based wireless sensor network and applies tree-based approach inside

each cluster. When a cluster is formed and cluster-head selected, it consider cluster-head



as root and construct an aggregation tree over cluster-member nodes. The process of

aggregation tree construction requires the sensor nodes to reduce their transmission

power as sensor nodes now have to send their data packets to the neighbor node which is

selected as parent. Energy consumed in wireless transmission is directly proportional to

the square of the distance between nodes in communication [13]. Since cluster-member

node now sends their data packets to the neighbor node instead of cluster-head, the

transmission distance is reduced and hence the energy consumption of the sensor node.

Likewise, overall energy consumption of sensor nodes in a cluster is reduced and so for

the whole sensor network. Hence overall network lifetime will be increased.

Energy-aware Balanced In-Network Aggregation (E-BINA) protocol is energy-

aware as it has taken the residual energy of sensor node in to consideration while

constructing the aggregation tree. The protocol also balances the network load by

selecting different parent for a node according to the energy level remain in the sensor

node during the aggregation tree construction process. Each parent node performs

aggregation of data packet that it receives from its child nodes and hence the protocol

justifies the in-network aggregation concept.

3.2 System & Energy Model

Consider a homogeneous network of n sensor nodes and a base station or sink

node distributed over a region. The location of the sensors and the base station are fixed

and known priori. Each sensor produces some information as it monitors its vicinity. We

assume that the whole network is divided in to several clusters; each cluster has a cluster-

head (CH). The clustering and the selection of cluster-head (CH) can be done by using

any existing protocol like LEACH, such that cluster-head (CH) is maximum k-hop away

from any node in cluster. We also assume that after the formation of cluster the

transmission power of all nodes is adjusted in such a way that they can perform single

hop broadcast. Single hop broadcast refers to the operation of sending a packet to all

single-hop neighbors [8].



Our energy model for the sensors is based on the first order radio model

described in [17]. A sensor consumes Eelec = 50nJ/bit to run the transmitter or receiver

circuitry and Eamp = 100pJ/bit/m2 for the transmitter amplifier. Thus, the energy

consumed by a sensor i in receiving a l-bit data packet is given by,

ERxi = Eelec . l (1)

while the energy consumed in transmitting a data packet to sensor j is given by,

ETxi,j = Eelec . l + Eamp . di,j2 . l (2)

where di,j is the distance between nodes i and j.

3.3 Protocol Description

In a cluster-based wireless sensor network, our algorithm is designed to provide

energy-aware in-network data aggregation in a cluster. Each cluster uses this algorithm

independently. In a cluster, the nodes can be categorized as: one cluster-head (CH) and

other cluster member node.

Function of cluster-head (CH)

1. Receive a query from base station.

2. Cluster-head (CH) sends configuration packets to all single-hop neighbors.

3. Receive data packets from all single hop neighbors.

4. Finally aggregate the data packets received and route it to base station.

Function of cluster member

1. Receive configuration packets from neighbor nodes.

2. Update and forward configuration packets to all single-hop neighbors.

3. Receive data packets from neighbor nodes.

4. Aggregate all data packets by applying redundancy factor and send it to selected

parent node.



The algorithm works in two phases: Configuration packet flow and Data packet flow that

are described below.

Initial cluster position Flow of configuration packets Flow of data packets

through selected parent

Figure 3.1 A typical scenario of data aggregation in a cluster.

3.3.1 Configuration Packet Flow

Initially cluster-head broadcast configuration packet to all its neighbors.

Configuration packet contains the following fields:

Node Id location of node that each node know in prior

Hop Distance distance from cluster-head in terms of hop count (set zero for CH)

Residual Energy current energy in node

Each node upon receiving the broadcast configuration packet that is originated

from cluster-head adds the sender of the packet in the list of its possible parents with its

node id, hop distance, residual energy. After this the node again broadcast the

configuration packet to all its neighbors by updating node id to its own id, incrementing

hop distance by one and its own residual energy. This process continues until all the



nodes in cluster receive configuration packet. All nodes that broadcast the configuration

packet do so by predefined and common signal strength that is know to all the nodes.

3.3.2 Data Packet Flow

When all nodes receives configuration packets, each node now select the parent to

which it can forward the data packet. The parent selection procedure is shown in fig. 3.2.

Each node looks in to the list of all its possible parents. The neighbor node which

has least hop distance, ie closest to cluster-head, is selected as parent by a node. In case

when two neighbor nodes have the least but equal hop distance, the node checks the

residual energy of two neighbor nodes. The neighbor node that has greater residual

energy is now selected as parent. In both the cases, node also calculate the difference of

residual energy of two neighbor nodes, which have least hop distance, and checks

whether this difference is less than the threshold or not. If it is then the node selects the

parent as usual. But if it is not then the node selects other neighbor node as its parent.

Define: Er[i]: residual energy of node i dh[i]: distance from CH in terms of hop count of node i Ed

ij: difference of residual energy of two nodes i & j te: threshold of residual energy difference of two nodes i & j for balancing load nid: id of a node S: set of configuration packets received by node i ParentSelection(nid)

1 select two nodes j & k such that dh[j] & dh[k] is minimal in S 2 if (dh[j] < dh[k] AND |Ed

jk| < te) 3 then return nid of node j 4 else if (dh[j]==dh[k] AND |Ed

jk| < te) 5 then return nid of node with max(Er[j] , Er[k]) 6 endif 7 else return nid of node k 8 endif

Figure 3.2 Parent selection procedure.



This allows a node that has more available resources to be selected as a parent node. This

also balances the consumption of energy of nodes in the cluster and leads to die out of

nodes nearly at same time.

After selecting the parent node, each node now forwards its data to its parent.

When a parent node receives multiple data packets from its neighbor nodes, it performs

aggregation operation by eliminating redundancy in the data. Each parent node checks

the equation below:

| VNi – VNj | < K (3)

where, VNi data value of node i

VNj data value of node j

K redundancy factor

If this equation satisfies, the parent node perform aggregation by applying any

aggregation functions like MIN, MAX, and AVG on the values of data packet and send

only one packet while discarding other packets. But if this equation do not satisfies, the

parent performs aggregation by simply concatenating two data packet in to one keeping

value of both packets intact.

The selection of value for redundancy factor (K) has a trade-off between precision

and energy consumption. If the application wants more precision, it should select a low

value for redundancy factor otherwise a high value. Selecting high value for K means

sending only one value thus less number of bits needs to be transmitted and hence low

energy consumption.



3.4 Issues

E-BINA significantly reduces the energy consumption of all nodes in the cluster

by reducing the transmission power of all nodes. The one issue that arises in our designed

protocol is that after the formation of cluster and selection of cluster-head, all nodes have

to reduce their transmission power. All nodes have to reduce their transmission power in

such a way that they could only reach their single-hop distance neighbors. This operation

requires some kind of synchronization among all nodes. The nodes have to program

before to perform the above task. For this, the programming task needs little extra effort.

Now when cluster-head received all data packets and aggregated them, it has to now

increase its transmission power so that it can transmit the final aggregated data up in the

cluster-head hierarchy towards the sink. Another issue that remains with any tree based

approach is robustness of the system. In case of failure of any intermediate node in the

tree hierarchy during operation will lead to the loss of data.

Though E-BINA requires all nodes to adjust their transmission power after the

deployment and requires extra effort for programming before, it conserves a significant

amount of energy. So in the presence of the above issue, E-BINA outperforms when we

try to maximize the network lifetime.

3.5 Summary

E-BINA takes the advantageous features of cluster-based and tree-based

approaches.

Instead of sending data directly to cluster-head, nodes form an aggregation tree in

each cluster.

In aggregation tree, the selection of parent is based on two factors: hop distance

and residual energy of node.

Energy model is based on first order radio model.



Protocol requires all nodes to adjust their transmission power.

Protocol defines two types of packets: configuration packet and data packet.

Each node eliminates redundancy in the data by satisfying equation (3).

In-network aggregation is performed during data flow towards cluster-head.

Issue: Adjustment of transmission power after deployment requires extra

programming effort.



Chapter 4

Performance Analysis

In previous chapter we have explained our proposed protocol for data aggregation

in cluster-based wireless sensor networks, called E-BINA. Now in this chapter we are

going to analyses the performance of our protocol. We will show how our protocol

outperforms in terms of energy efficiency in comparison to conventional protocol for data

aggregation in cluster-based wireless sensor networks by presenting simulation analysis

with different simulation parameters taken and results obtained.

4.1 Simulation Analysis

Though many simulation tools are available for wireless sensor networks as

discussed in chapter 2, we have chosen Network Simualtor-2 (NS-2) [20], in particular

NS-2.29.3, as our tool to simulate the proposed protocol.

4.1.1 Simulation Setup

A square field of 160m X 160m is taken where 11 nodes are randomly deployed.

One node is designated as cluster-head (CH) and one node is designated data

source.

Command:

set val(sc) "/root/Desktop/mov1"

M.Tech. – Thesis (July 2008) Indian Institute of Information Technology, Allahabad 28


set val(cp) ""

#setdest is found in "/root/ns-allinone-2.29/ns-2.29/indep-

utils/cmu-#scen-gen/setdest"

exec ./setdest -n 10 -M 0.1 -p 21 -x 160 -y 160 -t 20 > mov1 &

puts "loadin RANDom sceanrio"

source $val(sc)

The snapshot of node scenario in NAM is shown below. The three colors of node

show the energy levels of nodes. The initial color of nodes is green. When energy drop to

first threshold level the color turns to yellow and when drop to second threshold level the

color turns to red. After this level a node is dead and color is red.

Figure 4.1 Random nodes scenario in NAM.



Energy model is ON. Transmit power, Receive power, Idle power, Sleep power,

Transition power, and Initial Energy of nodes is set accordingly. Also

transmission range is set by controlling the transmit power and receiving

threshold of antenna of nodes. All other parameters are taken default values.

Command:

set val(energymodel) EnergyModel ;# energy model

is on

set val(initialenergy) 1 ;# Initial energy

in Joules

set val(sleeppower) 0.0 ;# sleep power in

Watt

set val(tp) 0.002 ;# transition

power

consumption(Watt)

in state

transition from

sleep to idle

(active)

set val(tt) 0.005 ;# transition

time(second) use

instate

transition from

sleep to idle

(active)

set val(ip) 0.035 ;# idle power



Transmit power, Receive power, transmit power & receiving threshold of antenna

of nodes are set with different values to use different transmission range of nodes and

show the comparison between E-BINA and conventional protocol.

Case 1: E-BINA

set val(rxPower) 0.395 ;# receive

power in Watt

set val(txPower) 0.66 ;# transmit

power in Watt

Phy/WirelessPhy set Pt_ 8.5872e-4 ;# 40m

Phy/WirelessPhy set RXThresh_ 3.66152e-10

Case 2: Conventional protocol

set val(rxPower) 1.0 ;# receieve

power in Watt

set val(txPower) 2.0 ;# transmit

power in Watt

Phy/WirelessPhy set Pt_ 7.214e-3 ;# 100m

Phy/WirelessPhy set RXThresh_ 3.65209e-10

Other values that could be used are:

#Phy/WirelessPhy set Pt_ 1.33826e-3 ;# Transmission

range 50m,

#Phy/WirelessPhy set Pt_ 0.281838 ;# 250m

The value of RXThresh_ is obtained by executing threshold.cc defined in "/root/ns-

allinone-2.29/ns-2.29/indep-utils/propogation”. Snapshot is given below:



Figure 4.2 Snapshot of console

Some other parameters used are:

Channel Type Wireless channel

Propagation Model Two Ray Ground

MAC Type 802.11

Network Interface Type Phy/WirelessPhy

Interface Queue Type Queue/DropTail/PriQueue

Antenna Model Antenna/OmniAntenna

Routing Protocol AODV

Simulation Time 20 sec

Parameters set for data transfer are:

Cluster-head node 10 with UDP agent attached

Source node node 0 with UDP agent attached

Traffic Type CBR with a rate of 5 packets / second

Packet Size 136 bytes



4.1.2 Simulation Run

Case 1: E-BINA

For E-BINA we set transmission range of 40m such that a node sends its data to its

single-hop neighbor and data is forwarded in a multi-hop fashion. Figure 4.3 shows the

data transfer between node 0 and node 10. Node 1 and 2 are relay nodes. Since the

transmission range is set to 40m, node 0 can only send its data to node 1 and so other

nodes.

Figure 4.3 Data transfer between node 0 and 10 through node 1 & 2.



Figure 4.4 shows that after some time energy level of node 2, 1, 0, and 10 dropped to first

threshold level and hence the color of nodes turns to yellow.

Figure 4.4 Energy level of nodes drop to first threshold.

Figure 4.5 shows that after some more time energy level of node 1, 2, 0, and 10 dropped

to second threshold level and hence the color of nodes turns to red.



Figure 4.5 Energy level of nodes drop to second threshold.

Case 2: Conventional Protocol

In conventional method, all nodes in a cluster send their data directly to cluster-head. For

this reason we set transmission range of nodes to be 100m so that source node 0 can send

data directly to node 10. To able to have a large transmission range the transmitting and

receiving power of nodes are more than double of as in case 1.

The data transfer start between node 0 and node 10 directly. As happened in last case,

again after some time energy level of node 0 and node 10 decreased to first threshold

level and color of nodes change from green to yellow and then energy level of both nodes

go down to second threshold level and nodes turn to red.



4.1.3 Simulation Results

A. Conserving Energy

We determine residual energy of the source node, which is defined as the remaining

energy of a node and considered that as the metric to prove energy efficiency of our

proposed protocol. We used this metric to show the impact of transmission power on

energy reduction. Figure 4.9 shows the significant reduction in energy consumption by

using E-BINA when compared with conventional protocol. This shows the benefit of

sending data in a multi-hop fashion towards cluster-head.

E-BINA Conventional protocol

Figure 4.6 Residual energy of source as a function of time.



B. Throughput

We have also measured the throughput of the receiving node i.e. cluster-head node 10 in

our scenario for both the cases. Throughput of a node is defined as the average rate of

successful message delivery over a communication channel. Figure 4.10 show that E-

BINA achieves high throughput in comparison with conventional protocol.

E-BINA Conventional protocol

Figure 4.7 Throughput as function of time.



C. Network Density

By considering the changes in the network density, we also study the relationship

between the network lifetime and network density. In our experiment we have considered

the change in the residual energy of source node i.e. node 0 in the end of simulation. The

density of network is calculated via equation [9]:

λ = NπR2 / A2

Where, N is sensor number,

R is sensor range,

A is sensor area.

By keeping network area constant and increasing the number of nodes, we have increased

network density. Due to increase in the network density, the hop count between source

node and sink node also increases. When hop count increases node now transmit data to

nearer node with less transmit power and hence consume less energy. Figure 4.8 shows

the increase in the residual energy when we increase the hop count. We have taken N as

11, 21, 31, 41, and 51.

Effect of Network Density

00.050.1

0.150.2

0.250.3

0.350.4

3 4 5 6 7

Hop Count

Resi

dual

Ene

rgy

Figure 4.8 Effect of network density



D. Packet Delivery Ratio

Besides examining the network lifetime extension roughly via energy saving, we also

evaluate the network efficiency influenced by E-BINA. Here, we measure the efficiency

in term of data delivery ratio, which is defined as the number of received packets divided

by the number of sent packets for a certain time period. From our simulation results

illustrated in figure 4.9, we find that this ratio does not change much while the network is

alive. It shows the stable performance of our protocol. When the network energy is

running out, the data delivery ratio collapses rapidly. This phenomenon probably can be

taken as a sign of the network death.

PDR

00.10.20.30.40.50.60.70.80.9

1

5 10 15 20

simulation time

pack

et d

eliv

ery

ratio

Figure 4.9 Packet delivery ratio of the network.



Chapter 5

Conclusion & Future Work

Wireless sensor networks are energy constrained network. Since most of the

energy consumed for transmitting and receiving data, the process of data aggregation

becomes an important issue and optimization is needed. Efficient data aggregation

protocols not only provide energy conservation but also remove redundancy in the data

and hence provide useful data only. There exist several protocols for data aggregation

which uses different approaches to provide energy efficiency. In cluster-based

approaches, nodes send their data directly to cluster-head and cluster-head then aggregate

and forward the data towards sink. We exploited this approach and proposed a new

protocol called Energy-aware Balanced In-Network Aggregation (E-BINA).

E-BINA uses the advantageous features of cluster-based and tree-based

approaches. E-BINA requires a wireless sensor network which is divided in to several

clusters, each having a cluster-head. Each cluster then uses E-BINA independently and

avoids aggregation only at cluster-head by constructing an aggregation tree rooted at

cluster-head. During the construction of aggregation tree, each cluster member node

chooses its parent node among its neighbors based on the information of residual energy

and hop distance from cluster-head. After the construction of aggregation tree, when a

parent node receives data from its different child neighbor nodes, it eliminates the

redundancy in the data received from different nodes and then forward. The difference

between E-BINA and other cluster-based approach lie in the reduction of transmission

power of node as in E-BINA a node send data to its neighbor node instead of sending to

cluster-head. This is the main performance improvement factor of our protocol.

M.Tech - Thesis (July 2008) Indian Institute of Information Technology, Allahabad 40


The simulation result shows that when the data from source node is send to

cluster-head through neighbors nodes in a multi-hop fashion by reducing transmission

and receiving power, the energy consumption is low as compared to that of sending data

directly to cluster-head.

Future work will focus on the implementation of E-BINA in NS-2 as a separate

module so that it could be tested more accurately. As we have already tested the effect of

reduction of transmission power on the energy consumption and we got positive result.

After implementing in NS-2, we will measure the whole network lifetime, packet

delivery ratio and effect of network density. Also the effect of redundancy factor on

energy consumption and overall performance of our protocol will be measured.

Enhancing E-BINA by introducing an effective compression technique for data is also the

part of future work.

Apart from working on E-BINA, future work also includes the extension of NS-2

for wireless sensor networks. Though some extensions are available but none of them

work properly and efficiently. Developing a framework for wireless sensor network in

NS-2 helps us to test our protocol more accurately; other protocol as well. It will provide

user a more realistic model for simulating a scenario.

M.Tech - Thesis (July 2008) Indian Institute of Information Technology, Allahabad 41


Appendix A

The Network Simulator version 2 (NS-2) is a deterministic discrete event network

simulator, initiated at the Lawrence Berkeley National Laboratory (LBNL) through the

DARPA funded Virtual InterNetwork Testbed (VINT) project. The VINT project is

collaboration between the Information Sciences Institute (ISI) at the University of

Southern California (USC), Xerox's Palo Alto Research Center (Xerox PARC),

University of California at Berkeley (UCB) and LBNL [20].

NS-2 was initially created in 1989 as an alternative to the REAL Network

Simulator. Since then there is significant growth in uses and width of NS project.

Although there are several different network simulators available today, ns-2 is one of the

most common. NS-2 differs from most of the others by being open source software,

supplying the source code for free to anyone that wants it. Whereas most commercial

network simulators will offer support and a guarantee but keeping the moneymaking

source code for themselves. The release of the source code helps users to create their own

functions and subprograms, but also makes it easier to implement them into the ns-2

environment. One of the main benefits for the ns project group releasing the source code

is that independent researchers can help in the development of ns-2. It is fairly common

that a researcher contributes with the code of a non-implemented protocol or algorithm,

after constructing it for his studies.

It should be noted that NS-2 is a research progressive effort and not a kind of

commercial software release. The difference is that there are very few people in the ns

project group compared to ordinary software, leading to difficulties in supporting all the

users. That problem has lead to the solution of having a huge mailing list

(http://mailman.isi.edu/mailman/listinfo/ns-users) for anyone interested, as well as a

complete archive of all the mails ever been sent to this mailing list. The mailing list is

based on the idea of user helping user, taking the load of the ns project group. The



mailing list and the archives are a huge help for all users of ns-2, no matter old or new,

since usually someone else has had the same problem before.

Another important thing to note is that NS-2 is an ongoing progressive project and

hence can not be considered as a complete product. This is the reason why it is free of

cost and only offers the mailing lists as support. The people that are in charge of the

project heavily rely on the users to find bugs and faults and reporting these when

discovered. This also leaves the validating of results to the user, but the user is not alone

so help is just an email away. The most commonly used protocols are so well

implemented and checked so the main concerns are the new implementations. New

implementations usually start out as a research assignment not linked to the ns project

group. Since the project group does not have a full company helping them in verification

and implementation they have no possibility to do everything themselves thus

encouraging any help they can get.

A 1 The NS-2 structure

NS-2 is made up of hundreds of smaller programs, separated to help the user sort

through and find what he or she is looking for. Every separate protocol, as well as

variations of the same, sometimes has separate files. Though some are simple, but still

dependent on the parental class [20].

Figure A.1 The basic structure of NS-2



A 1.1 C++

C++ is the predominant programming language in ns-2. It is the language used for

all the small programs that make up the ns-2 hierarchy. C++, being one of the most

common programming languages and specially designed for object- oriented coding, was

therefore a logical choice what language to be used. This helps when the user wants to

either understand the code or do some alterations to the code. There are several books

about C++ and hundreds, if not thousands, of pages on the Internet about C++

simplifying the search for help or answers concerning the ns-2 code.

A 1.2 OTcl

Object Tcl (OTcl) is object-oriented version of the command and syntax driven

programming language Tool Command Language (Tcl). This is the second of the two

programming languages that NS-2 uses. The front-end interpreter in NS-2 is OTcl which

link the script type language of Tcl to the C++ backbone of NS-2. Together these two

different languages create a script controlled C++ environment. This helps when creating

a simulation, simply writing a script that will be carried out when running the simulation.

These scripts will be the formula for a simulation and is needed for setting the

specifications of the simulation itself. Without a script properly defining a network

topology as well as the data-rows, both type and location, nothing will happen. For a

more in depth presentation of these scripts, it is better to have a closer look at the

introduction and related chapters in the NS-2 manual.

A 1.3 Nodes

A node is exactly what it sounds like, a node in the network. A node can be either

an end connection or an intermediate point in the network. All agents and links must be

connected to a node to work. There are also different kinds of nodes based on the kind of

network that is to be simulated. The main types are node and mobile node, where node is

used in most wired networks and the mobile node for wireless networks. There are



several different commands for setting the node protocols to be used, for instance what

kind of routing is to be used or if there is a desire to specify a route that differs from the

shortest one. Most of the commands for node and mobile node can be easily found in the

ns documentation. Nodes and the closely connected link creating commands, like simplex

link and duplex link, could be considered to simulate the behavior of both the Link Layer.

A 1.4 Agents

An agent is the collective name for most of the protocols you can find in the

transport layer. In the ns-2 documentation agents are defined as the endpoints where

packets are created and consumed. All the agents defined in ns-2, like tcp, udp etc., are

all connected to their parent class, simply called Agent. This is where their general

behavior is set and the offspring classes are mostly based on some alterations to the

inherent functions in the parent class. The modified functions will overwrite the old and

thereby change the performance in order to simulate the wanted protocol.

A 1.5 Applications

The applications in ns-2 are related to the Application Layer in the TCP/IP suite.

The hierarchy here works in the similar way as in the agents case. To simulate some of

the most important higher functions in network communication, the ns-2 applications are

used. Since the purpose of ns-2 is not to simulate software, the applications only

represent some different aspects of the higher functions. Only a few of the higher layer

protocols has been implemented, since some are quite similar when it comes to using the

lower functions of the TCP/IP stack. For instance there is no use adding both a SMTP

and a HTTP application since they both use TCP to transfer small amounts of data in a

similar way. The only applications incorporated in the release version of ns-2 are a

number of different traffic generators for use with UDP and telnet and FTP for using

TCP. All the applications are script controlled and when concerning the traffic

generators, you set the interval and packet-size of the traffic. FTP can be requested to

send a data packet whenever the user wants to, or to start a transfer of a file of an



arbitrary size. If starting an FTP transmission and not setting a file-size the transmission

will go on until someone calls a stop.

A 1.6 NAM

The Network Animator NAM is a graphic tool to use with ns-2. It requires a nam-

tracefile recorded during the simulation and will then show a visual representation of the

simulation. This will give the user the possibility to view the traffic packet by packet as

they move along the different links in the network. NAM offers the possibility of tracing

a single packet during its travel and the possibility to move the nodes around for a user to

draw up his network topology according to his own wishes. Since the simulation has

already been performed there is no possibility for the user to change the links or any other

aspect of the simulation except the representation. The existence of an X-server allows

NAM to be able to open a graphical window. Therefore if NAM is to work, there must be

a version of X-server running.



REFERENCES

[1] W. Choi, P. Shah, and S. K. Das, “A Framework for Energy-Saving Data Gathering

Using Two-Phase Clustering in Wireless Sensor Networks”, in Proceedings of the

International Conference on Mobile and Ubiquitous Systems: Networking and

Services (MobiQuitous), Boston, 2004, pp. 203-212.

[2] M. Lee, and S. Lee, “Data Dissemination for Wireless Sensor Networks”, in

Proceedings of the 10th IEEE International Symposium on Object and Component-

Oriented Real-Time Distributed Computing (ISORC’07).

[3] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva, “Directed

Diffusion for Wireless Sensor Networking”, IEEE/ACM Transactions on Networking,

Vol. 11, no. 1, Feb 2003.

[4] H. Cam, S. Ozdemir, P. Nair, and D. Muthuavinashiappan, “ESPDA: Energy-

Efficient and Secure Pattern-based Data Aggregation for Wireless Sensor Networks”,

in Proceedings of IEEE Sensor- The Second IEEE Conference on Sensors, Toronto,

Canada, Oct. 22-24, 2003, pp. 732-736.

[5] L. Gatani, G. Lo Re, and M. Ortolani, “Robust and Efficient Data Gathering for

Wireless Sensor Networks”, in Proceeding of the 39th Hawaii International

Conference on System Sciences – 2006.

[6] K. Dasgupta, K. Kalpakis, and P. Namjoshi, “An Efficient Clustering-based Heuristic

for Data Gathering and Aggregation in Sensor Networks”, IEEE 2003.

M.Tech.- Thesis (July 2008) Indian Institute of Information Technology, Allahabad 47


[7] E. Fasolo, M. Rossi, J. Widmer, and M. Zorzi, “In-Network Aggregation Techniques

for Wireless Sensor Networks: A Survey”, IEEE Wireless communication 2007.

[8] M. Lee and V.W.S. Wong, “An Energy-aware Spanning Tree Algorithm for Data

Aggregation in Wireless Sensor Networks,” IEEE PacRrim 2005, Victoria, BC,

Canada, Aug. 2005.

[9] M. Ding, X. Cheng, and G. Xue, “Aggregation tree construction in sensor networks,”

in Proc. of IEEE VTC’03, Vol. 4, Orlando, FL, Oct. 2003.

[10] “The Design Space of Wireless Sensor Networks” by Kay R¨omer and Friedemann

Mattern

http://www.vs.inf.ethz.ch/publ/papers/wsn-designspace.pdf

[11] http://en.wikipedia.org/wiki/Wireless_Sensor_Networks

[12] “Handbook of Sensor Networks: Compact Wireless and Wired Sensing Systems” by

Mohammad Ilyas and Imad Mahgoub.

[13] “Ad-hoc & Sensor Networks” by Carlos de Morais Cordeiro and Dharma Prakash

Agrawal.

[14] S. Madden et al., “TAG: a Tiny Aggregation Service for Ad-hoc Sensor Networks,”

OSDI 2002, Boston, MA, Dec. 2002.

[15] S. Nath et al., “Synopsis Diffusion for Robust Aggregation in Sensor Networks,”

ACM SenSys 2004, Baltimore, MD, Nov. 2004.

[16] K. Dasgupta et al., “Maximum Lifetime Data Gathering and Aggregation in

Wireless Sensor Networks”, In Proc. of IEEE Networks’02 Conference, 2002.


http://www.vs.inf.ethz.ch/publ/papers/wsn-designspace.pdf

http://en.wikipedia.org/wiki/Wireless_Sensor_Networks


[17] W. Heinzelman, A.P. Chandrakasan, and H. Balakrishnan, “Energy-Efficient

Communication Protocols for Wireless Microsensor Networks”, In Proc. Of

Hawaiian Intl. Conference on System Science, 2000.

[18] NRL’s Sensor Network Extension to ns-2. http://nrlsensorsim.pf.itd.nrl.navy.mil/

[19] David curren, “A survey of Simulation in Sensor Networks”.

www.cs.binghamton.edu/~kang/teaching/cs580s/david.pdf

[20] The Network Simulator NS-2, http://www.isi.edu/nsnam/ns/ , January 2008.


http://nrlsensorsim.pf.itd.nrl.navy.mil/

http://www.cs.binghamton.edu/%7Ekang/teaching/cs580s/david.pdf

http://www.isi.edu/nsnam/ns/

Data Aggregation in Cluster-based Wireless Sensor...

Documents

Transcript of Data Aggregation in Cluster-based Wireless Sensor...