A Tree-Cluster-Based Data-Gathering Algorithm for Industrial WSNs ...

SPECIAL SECTION ON INDUSTRIAL SENSOR NETWORKS WITH ADVANCEDDATA MANAGEMENT: DESIGN AND SECURITY

Received March 30, 2015, accepted April 5, 2015, date of publication April 22, 2015, date of current version May 5, 2015.

Digital Object Identifier 10.1109/ACCESS.2015.2424452

A Tree-Cluster-Based Data-Gathering Algorithmfor Industrial WSNs With a Mobile SinkCHUAN ZHU1, SHUAI WU1, GUANGJIE HAN1, (Member, IEEE), LEI SHU2, (Member, IEEE),AND HONGYI WU3, (Member, IEEE)1Department of Information and Communication Systems, Hohai University, Changzhou 213022, China2Guangdong Petrochemical Equipment Fault Diagnosis Key Laboratory, Guangdong University of Petrochemical Technology, Maoming 525000, China3University of Louisiana at Lafayette, Lafayette, LA 70504, USA

Corresponding author: G. Han ([email protected])

This work was supported in part by the Qing Lan Project, in part by the 2013 Special Fund of Guangdong Higher School TalentRecruitment, in part by the Educational Commission of Guangdong Province, China, under Project 2013KJCX0131, in part by theGuangdong High-Tech Development Fund under Grant 2013B010401035, in part by the 2013 Top Level Talents Project in SailingPlan of Guangdong Province, in part by the National Natural Science Foundation of China under Grant 61401107, and in part bythe 2014 Guangdong Province Outstanding Young Professor Project.

ABSTRACT Wireless sensor networks (WSNs) have been widely applied in various industrial applications,which involve collecting a massive amount of heterogeneous sensory data. However, most of thedata-gathering strategies for WSNs cannot avoid the hotspot problem in local or whole deployment area.Hotspot problem affects the network connectivity and decreases the network lifetime. Hence, we propose atree-cluster-based data-gathering algorithm (TCBDGA) forWSNs with a mobile sink. A novel weight-basedtree-construction method is introduced. The root nodes of the constructed trees are defined as rendezvouspoints (RPs). Additionally, some special nodes called subrendezvous points (SRPs) are selected accordingto their traffic load and hops to root nodes. RPs and SRPs are viewed as stop points of the mobile sinkfor data collection, and can be reselected after a certain period. The simulation and comparison with otheralgorithms show that our TCBDGA can significantly balance the load of the whole network, reduce theenergy consumption, alleviate the hotspot problem, and prolong the network lifetime.

INDEX TERMS Data-gathering scheme, cluster, mobile sink, wireless sensor networks.

I. INTRODUCTIONWireless sensor networks (WSNs) have been widely appliedin various industrial applications, e.g., patient monitoring anddisease analysis, equipment monitoring and fault prediction,pollution monitoring and source detection, sea searching andtide monitoring. A WSN consists of a large number of tinysensor nodes with low power, limited storage and transmis-sion abilities [1]. Lots of energy of a sensor is consumed bycommunication phase in data gathering process, especiallyfor a massive amount of heterogeneous sensory data. Also,in multi-hop communication, sensor nodes that near the sinktend to consume more energy as they are responsible forreceiving and forwarding data from the whole network.It leads to a non-uniform energy consumption amongnodes, i.e. the ‘‘hotspot problem’’. To alleviate the‘‘hotspot problem’’ and balance the energy consumption,mobile elements (MEs) such as mobile sink and mobile datacollector have been adopted for data gathering. However,for most of the data gathering algorithms, the data gatheringlatency is inevitable if the MEs perform data collecting viashort range communications. On the contrary, if sensor nodesforward their data to the MEs via multi-hop transmissions,

due to the overhead of control messages and communications,the energy consumption will be increased, which will largelyreduce the network lifetime. Thus, it is necessary to balancethe tradeoff between the relay hops of multi-hop communica-tions and the tour length of the ME, which implies to achievea long network lifetime with short data gathering latency.

To achieve this objective, many efficient mobile datagathering algorithms have been proposed. From the point ofthe view of consisting elements, these mobile data gatheringalgorithms can be divided into two categories:• Heterogeneous Sensor Network: It consists of staticsensor nodes, advanced nodes, mobile elements and astatic base station (BS) [2]. The advanced nodes alwayshave more resources than other nodes, so that they canact as rendezvous points (RPs) for data aggregation.Also the number and position of these advanced nodesmust be constrained to limit the tour length of MEs,which in turn shorten the data gathering latency.

• Homogeneous Sensor Network: Some sensor nodes withshorter distance to the BS or mobile sink trajectory arerecruited as polling points for the MEs to visit. Thiscategory of algorithms always limit the relay hops to a

VOLUME 3, 20152169-3536 2015 IEEE. Translations and content mining are permitted for academic research only.

Personal use is also permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

381

C. Zhu et al.: TCBDGA for Industrial WSNs With a Mobile Sink

small value to reduce the energy consumption ofmulti-hop transmissions.

However, to further investigate the characteristic andperformance of these mobile data gathering algorithms, wedivide the related literatures into two categories. The firstcategory is a polling-based or rendezvous point-basedapproach, in which the mobile elements visit each of theselected polling points or rendezvous points, and collect thedata of their associated sensors via short range communica-tions [3]–[7]. In this category, when selecting the suitablepause locations for the mobile elements, the energyconsumption and data gathering latency should be jointlyconsidered. The second category adopts some existed cluster-ing algorithms to select proper cluster heads (CHs) for dataaggregation and uploading [8]–[13]. In this category, sensornodes are organized into clusters and form the lower layer ofthe network. Cluster members upload their data to the CHsand the mobile elements collect data from the CHs at thehigher layer. The cluster members and the CHs construct abackbone structure in the network.

In this paper, we propose a tree-cluster based data gatheringalgorithm with mobile sink, which is called TCBDGA forabbreviation. Contributions of this paper can be summarizedas follows:

1) Weight-Based Tree Construction: It takes residualenergy, the number of neighbors, and the distance tothe BS into consideration.

2) Tree Decomposition and Sub-Rendezvous PointsSelection: It constrains the local data collection scope.

3) RPs and SRPs Re-selection: It is performed by treereconstructions after certain data collection rounds, andmakes the energy consumption balanced.

The rest of this paper is organized as follows. First,recent research works are investigated in Section II. Then,in Section III, the network and sensor energy consumptionmodel are described. And, in Section IV, we introduce anddiscuss the core algorithm TCBDGA, which includes threestage, tree construction, RPs and SRPs selection, and datacollection. The simulation results and performance evaluationare given in Section V. And finally, Section VI concludesthe paper.

II. RELATED WORKA. OVERVIEWThe method of using mobile elements to gather data hasdrawn widely attention in recent years. In this section, webriefly outline some related typical literatures on mobile datagathering in WSNs, and group them into the following twocategories:

1) Direct Communicating Collection, one or more MEsvisit each sensor node and collect its data only viasingle-hop communication;

2) Multi-hop Communicating Collection, the MEs onlyvisit a subset of sensor nodes or some designatedpositions, and other nodes need to send their data viaa multi-hop path.

In the category of multi-hop communicating collection,MEs may visit only a set of locations, sensor nodes orsome areas of the WSN and sensor nodes send their data toME through multi-hop communication. Some related issues,such as schedule pattern, load balance, and data deliverylatency, are also jointly considered along with routing tofurther improve energy efficiency. Therefore, Multi-hopcommunicating collection can further be divided into twosub-categories based on the amount of data needed to becollected:• Local Data Collection, only interest data from sourcenodes need to be collected;

• Full Data Collection, data of the whole network neededto be collected.

To make our categories more clearly and well understood,we describe the related literatures in the following threeclassifications: direct communicating collection, local datacollection, and full data collection. And finally, we give asummary at the end of this section.

B. DIRECT COMMUNICATING COLLECTIONExisting algorithms falling into this category always takeadvantage of the ME to gather data from a homogeneousnetwork when it moves into the communication range ofeach sensor node. In general, only single-hop communica-tion is required in these algorithms, which can achieve thelongest network lifetime by minimizing the communicationoverhead on data gathering. However, the maximum energyuse efficiency is achieved at the expense of extremely longdelivery latency. Therefore, some of them just adopt somemethods to compute a shortest path for the ME such thatthe data gathering latency can be reduced to a certaindegree [15]. Additionally, others employ multiple MEs tofurther minimize the latency and the buffer overflow [16].

In [15], a data mule is exploited to collect data from allsensor nodes when it is in their communication range. Themain objective of this protocol is to minimize data deliverylatency by solving the DataMule Scheduling (DMS) problemand achieve a shortest trajectory for the data mule underthe location and time constraints. In [16], the target field isdivided into several non-overlapped zones by a novel partitionalgorithm. To balance the data gathering latency, in each zonea mobile sink is adopted to visit every sensor node of thatzone for data collection. The shuffled frog leaping algorithmis employed in the algorithm to calculate the optimal solutionfor the TSP and to reduce the traveling time of each mobilesink as well.

Existing protocols in this type only can be applied to somedelay-tolerant applications. They are impractical when thereare a large number of sensor nodes. Somany researchers favormulti-hop communication for effective data gathering.

C. LOCAL DATA COLLECTIONTo collect interest data from source nodes, some agent-basedprotocols employ one or more agents to track the latest loca-tion of the ME and then construct a optimal routing path from

382 VOLUME 3, 2015


the source node to the ME for data forwarding viamulti-hop communication [17]–[19]. Other approachessupporting query-driven, data-driven or both, would like toadopt some methods to establish the shortest data reportingroutes and also reduce control overhead [20]–[22]. However,the hotspot problem still exists in these protocols. Also,the sink mobility can cause unexpected changes of networktopology and it may bring excessive protocol overhead forroute maintenance.

1) AGENT BASED APPROACHESTo enable efficient dissemination in large scale wirelesssensor networks with sink mobility, an Intelligent Agent-based Routing (IAR) protocol is proposed in [17]. The closestnode of the mobile sink is defined as an immediate relaynode (IR). The IR is responsible for transmitting the relaypath setup message (RPSP) to the agent along the shortestpath. Then, data packets are routed along the reverse pathof the RPSP. Moreover, the IR decides whether they mustselect a new path or existing path from the elected immediaterelay to sources by distance comparison algorithm. The IARcan reduce the delay, energy consumption and the controloverhead for data delivery.

To reduce the overhead of frequent location updates of themobile sink, the authors in [18] adopt multiple agents that arelocated on the path between the source node and the mobilesink for data transmission. In the proposed DynamicMulti-Agent-based Local Update Protocol (DMALUP),agents can be changed based on the location of the mobilesink so as to support the approximate shortest path. Only thelast agent needs to update the latest location information ofthe mobile sink, which decreases the number of overheads.Further, the agent is dynamically changed and the proxi-mately shortest route is created, and hence, the delay in datapacket transmission is the minimum.

To address sink mobility and save energy, a multiplemobile sinks data dissemination mechanism (MSDD) isintroduced in [19], which supports both event-driven andquery-driven application. It exploits a hierarchical monitoringmechanism based on a two-tier grid structure that constructedby a master sink. To make the detected events be reported tothe appropriate sinks immediately, MSDD employs a globalagent (GA) to maintain each sink’s location. It is proved thatMSDD can decrease the communication overhead.

2) QUERY AND DATA-DRIVEN APPROACHESThe authors of [20] take predictable mobility of the sinks intoconsideration and propose a Predictable Mobility-based DataDissemination protocol (PMDD) for data collection. A virtualgrid structure is constructed, and only grid heads need toforward the query of sink and data of source nodes. Sensornodes in PMDD know future locations to which the sink willmove, so that they can forward their data directly to the sinkviamulti-hop communications. The authors has validated thatPMDD can decrease the energy consumption and increase thedata delivery ratio efficiently.

In [21], an efficient data-driven routing protocol (DDRP)is presented. It exploits the broadcast feature of wirelessmedium for gratuitous route learning, and combines data-driven packet forwarding with random walk routing for dataforwarding path construction. DDRP can reduce the protocoloverhead for route discovery and maintenance caused by sinkmobility while keeping high packet delivery performance.SinkTrail [22], an energy efficient proactive data reportingprotocols, is proposed for mobile sink-based datacollection. SinkTrail establishes a logical coordinate systemfor routing and forwarding data packets. Sensor nodes updatetheir coordinates corresponding to the trail messages andthe sojourn positions of the mobile sink, and then selectnext hop with the shortest logical distance to the mobilesink to establish data reporting routes. The proposedprotocol feature low-complexity and reduced controloverheads.

D. FULL DATA COLLECTIONIn this category, recent protocols always introduce hierarchi-cal structure to a homogeneous or heterogeneous WSN andselect some positions, sensor nodes or areas to visit for datacollection. To achieve a balance between energy consumptionand data gathering latency, they should constraint the hopdistance of the data relaying path and the tour lengthof the MEs to suitable values. These protocols can bedivided into three categories just as described in [23]: PP/RPbased approaches, cluster based approaches, and area basedapproaches.

1) PP/RP BASED APPROACHESIn this sub-category, they firstly select or predefine somepolling points (PP) or rendezvous points (RP) for theMEs to visit, which bounds the tour length. Then othersensor nodes need to affiliate themselves to the properPP/RP and send their data to the MEs via short rangecommunications [3]–[7], [9], [11], [13]. When selecting suit-able pause locations or sensor nodes for the MEs to visit,the energy consumption and data gathering latency should bejointly considered.

Two polling-based data gathering approaches are proposedin [3], which is called SPT-DGA and PB-PSA, respectively.The centralized algorithm SPT-DGA is to iteratively findsome proper sensor nodes as PPs on a shortest path tree (SPT).The selected PPs need to link as many as possible sensornodes with the limited relay hop bound, so as to obtain theminimum number of PPs for mobile collectors to visit andperform data gathering. The basic idea of the distributedalgorithm PB-PSA is that sensor nodes need firstly know thenumber of its neighbors and the distance to the data sink, andthen iteratively select the best set of PPs by comparing thetwo parameters. In this way, both of the two algorithms canhave the advantage of the minimum number of PPs, whichlead to a short tour length of the mobile element. But neitherof them has considered the factor of residual energy of eachnode when selecting PPs, and the selected PPs always deplete

VOLUME 3, 2015 383


their energy faster than other nodes after some data gatheringrounds.

The authors of [5] take advantage of SDMA techniquewhen designing data gathering protocol such that compatiblesensors can upload their data to the SenCar concurrently. Forthe single SenCar case, their objective is to find theminimum set of polling points from the candidates for obtain-ing the tradeoff between the shortest moving tour and thefull utilization of SDMA. While for the multiple SenCarscase, a region-division and tourplanning (RDTP) algorithmis proposed in which the data gathering time is uniformlydistributed among different regions by decompose the treebased on the weight of data uploading and moving time.All the proposed algorithms can successfully achieve a fullcoverage of network with the shortest data gathering latency,and make the network load balanced. However, this paperdoes not take the energy consumption and hotspot probleminto consideration.

In [4], an M-collector is used to visit each of the selectedPPs for data gathering. They mainly focus on the problem ofminimizing the tour length of theM-collector and refer to it asa single-hop data gathering problem (SHDGP) when select-ing PPs. What’s more, when consider multiple M-collectors,they adopt the spanning tree covering algorithm to select theminimum number of PPs and find the shortest sub-tours ofeachM-collector for data gathering. This scheme can achieveshort data gathering latency since it can obtain the minimumnumber of PPs. However, it does not consider the energy ofeach node, as well as the hotspot problem around the PPs.

A weighted rendezvous planning (WRP) method [6] isproposed to select suitable sensor nodes as RPs for the mobilesink to visit. In WRP, each of the nodes has a weight jointlyconsidering the hop count and the number of forwardedpackets. Therefore, sensor nodes farther away from theselected RPs or with more than one packet in their buffer havea higher priority of being recruited as RPs. Hence, visiting thefinal selected RPs for data gathering will reduce the numberof multi-hop transmissions, and thereby reduce the energyconsumption. But each time after selecting a new RP, thenetwork needs to recalculate the weight of the rest of nodes,which will cause a large control overhead.

In [7], a concentric circular trajectories model with thesame distance of each ring is used. In each sub-trajectory,a number of Waypoint nodes are distributed uniformly for theMobile Data Harvester (MDH) to visit, in order to achievethe desired full coverage. Also, the one-hop neighbors of theWaypoint nodes are chosen as the Designated Gateways (DG)for data buffering, which means that other nodes only needto upload their data to these DGs rather than the Waypointnodes. Then the MDH collects data from these DGs whenarrives at these Waypoint nodes. This method alleviates theload ofWaypoint nodes, but it does not consider the energy ofeach node and the nodes density, which may lead to a hotspotproblem around the Waypoint nodes.

To alleviate the sink-hole problem, the authors of [11]make the mobile sinks move on a predetermined path,

along the perimeter of a hexagonal tiling, and stop at multiplelocations for data gathering. Meanwhile, they also propose adistributed algorithm used by the sinks to decide their nextmoving location such that the virtual backbone formed by themobile sinks remains interconnected at all times. At the endof each data gathering round, a sink decides whether or not tomove to a new location, depending on the energy levels of itsone-hop sensor neighbors. But the mobile sink may stay at alocation for a long time, which leads to a long data gatheringlatency.

The authors of [13] propose a moving strategy calledenergy-aware sink relocation (EASR) for mobile sink toalleviate the hotspot problem. In this strategy, the transmis-sion range of sensor nodes and the moving strategy of themobile sinks are adaptively adjusted according to the residualenergy of them. The disadvantage of this algorithm is thatthe control overhead for changing the transmission range ofnodes may be high, and the mobile sinks need frequently tocheck the residual energy of the sensor nodes and whether themoving condition is satisfied, which also incur large networkenergy consumption.

In [9], the authors consider a heterogeneous WSN whichconsists of a large number of sensor nodes, a few mobiledata collectors (DCs), and a static BS. The DCs play therole of the cluster head and perform the cluster formationprocess at every pause location.Whenever the average energyof the nodes in the one-hop region of the DC is belowthe threshold value, the DC need to select its next posi-tion from its candidate list such that the average residualenergy is the highest and the reliability R is satisfied.But multi-hop transmissions from sensor nodes far awayfrom the DCs may lead to high energy consumption tothe network. Also, the frequent changes of the positions ofDCs may incur a large overhead for the network topologymaintenance.

2) CLUSTER BASED APPROACHESProtocols falling to this sub-category always take the advan-tage of some existed clustering algorithms to organizethe sensor network into different clusters. The networkmay be homogeneous or heterogeneous. Proper clusterheads (CHs) need to be selected for data aggregation anduploading [8], [10], [12], [24]–[30]. Cluster members uploadtheir data to the CHs and the MEs collect data from the CHsat the higher layer.

An efficient energy-aware distributed intelligent datagathering algorithm (DIDGA) is presented in [24], whichincludes cluster formation and path formation phases.DIDGA constructs a minimum connected dominatingset (MCDS) based on maximal independent sets (MISs) toform a cluster, and selects the node with more power tobe cluster head to prolong the network lifetime. Then, themobile collector needs to communicate with each clusterhead to gather sensed data. A path formation optimizedalgorithm (PFOA) combines ant colony algorithm andevolutionary algorithm is introduced for the mobile collector

384 VOLUME 3, 2015


to solve the TSP problem and to satisfy the time-limitedconstraints.

The authors in [25] adopt a distance based rangeconstrained clustering (RCC) algorithm to organize sensornodes into groups. The cluster center is defined as stop pointsfor the MDC. The tour of MDC is modeled as a TSP on thestop points. To reduce the data gathering latency, theRCC algorithm tries to obtain the minimum number of stoppoints but with full coverage. Moreover, a Edge-Fold (EF)heuristic is developed to get a balance between increasing thetour length and utilizing the wireless link, which further bringdown the average packet delay.

In [8], a metric-based distributed clustering algorithm ispresented for the lossy network. The authors have studiedboth one-hop and k-hop clustering algorithms. In thisalgorithm, both link qualities and residual energy are jointlyconsidered in calculating the weight of each node for select-ing proper CHs. However, the selected CHs are responsiblefor receiving all data of its members. It may lead to a fastdepletion of their energy and storage space.

Some literatures may take the advantage of a heteroge-neous architecture for WSNs composed of a few resourcerich sensor nodes and a large number of simple static sensornodes. The advanced nodes can help relieve sensors that areheavily burdened by high network traffic, thus extendingthe network lifetime. In [26], the network is assumed to beheterogeneous with some advanced sensor nodes and normalnodes. A mobile sink based improved stable election (MSE)algorithm is introduced. The mobile sink moves back andforth along a fixed center line of the sensing field. Theselection of cluster heads is based primary on the fractionof advanced nodes with additional energy, and secondary onthe ratio between residual energy and initial energy.Non-cluster head nodes need to affiliate themselves to thecluster heads with strongest received signal strength (RSS).TheMSE protocol can achieve energy balancing and networklifetime prolonging, but the fixed trajectory may not suitableall the time.

In [12], a rendezvous-based data dissemination protocolbased on multi-hop clustering (RDDM) is proposed.It exploits a Backbone-based Virtual Infrastructure (BVI)which uses only CHs for data routing. The RDDM providesenough backup nodes to substitute for a CH and enoughbackup paths between neighbor CHs, which provide highrobustness against node and link failures. Moreover, whena source node wants to upload its packet to the destination, ashortest path will be found by the RDDM through signalingonly between neighbor CHs. It can reduce the energyconsumption. But the CHs may depletion their energy firstdue to their high burden.

The authors of [27] also construct a BVI which composesclusters of the sensor nodes and connects the cluster heads asa shared tree. The clustering algorithm used in this protocolis KOCA [28], which aims at generating connected overlap-ping multi-hop clusters in WSNs. The protocol can providethe shortest path from a source node to the mobile sink by

exchanging routing tables between neighbor cluster heads.A mobile sink only registers the cluster ID of the currentstanding cluster into the root of the shared tree and finally,the source delivers data to the current standing cluster via theshortest path by the routing table.

A virtual grid based dynamic routes adjustment (VGDRA)scheme is presented in [29], which also constructs a BVI fordata collection. It divides the sensing field into K equal sizedcells, and selects nodes close to the center to be cell headers.The BVI is constructed by the cell headers. The mobile sinkmoves along the periphery of the field and communicateswith the border cell headers for data collection. To reduce thecommunication cost, only a subset of cell headers needs totake part in re-adjusting their routes to the latest location ofthe mobile sink.

In [10], the authors build a cluster structure and proposea distributed protocol which only selects nodes withsufficient energy and in close proximity with the mobilesink (MS) for sufficient long time as RendezvousNodes (RNs). The clustering scheme used in this algorithmconstructs a multi-size cluster structure, where the size ofeach cluster is inversely proportion to the distance of itscluster head from the MSs trajectory. By clustering andselecting the proper number of RNs, it can minimize the dataloss and achieve great data throughput. Also, the replacementof RNs can lead to a balanced energy consumption of thenetwork. But its application scenario is limited to non-realtime applications.

Similar to [10], the authors of [30] preserve the benefitsof the LEACH algorithm to form a cluster and improveCH selection by considering the residual energy of sensornodes. They also combine themwith the concepts of a mobilesink and rendezvous nodes (RNs). The selection of RN is alsobased on the distance from the MS trajectory. The proposedprotocol can decrease the energy consumption in large-scalenetworks.

3) AREA BASED APPROACHESIn this category, the MEs visiting different areas for datacollection have different effects on the network performance,such as load balance [31], [32] and decreased data gatheringlatency [33]. Sensor nodes in other areas need to transmit theirdata to these visited areas via multi-hop communication.

The authors of [31] propose an Energy-Balanced DataCollection (EBDC) mechanism. It considers a circularmonitoring region, in which sensor nodes in different ringshave different data-relayingworkloads. To balance the energyconsumption, the mobile sink needs to move along differenttracks with different number of sweep repetitions to collectdata. The number of sweep repetitions in each track is calcu-lated by the energy consumption of each sensor node whenthe mobile sink move along that track.

Similar to [31], in order to balance the non-uniform energyconsumption of sensor nodes in a circular area, the authorsof [32] present a clustering-based routing algorithm inwireless sensor networks with mobile sink (CRA).

VOLUME 3, 2015 385


In the CRA, the mobile sink moves around a specified circleand the circular area near this trajectory is defined as bufferarea. The buffer area has the largest load in the network, soin order to achieve maximum network lifetime, the main taskin the CRA is to determine the radius of the buffer area andthe radius of the moving circle of the mobile sink. Moreover,an improved algorithm CRA-1 is also introduced by usingdensity controlling to further balance the network load.

A biased sink mobility with adaptive stop times is studiedin [33], as a method for efficient data collection in wirelesssensor networks. To achieve a balance of energy consumptionand data gathering latency, when selecting the next area tovisit, the mobile sink favors less visited areas to cover thenetwork area faster, and adaptively stops more time in someregions that tend to produce more data.

E. SUMMARYIn summary, as listed in Table 1, direct communicationapproaches incur an extremely long data gathering latency,while some multi-hop communication for local datacollection approaches such as agent-based approaches andother query/data-driven approaches still encounter the prob-lems of non-uniform energy consumption and large controloverhead due to long relaying hop distance. So it is necessaryto jointly consider the relay hop distance and the tour lengthof theMEswhen perform full data collection. In some cluster-based approaches, a CH buffers the data packets from itscluster members till the ME comes in contact, which maylead to fast depletion of the CHs due to the heavy burden.The proposed area based approaches are only feasible tosome special network architectures with non-uniform loaddistribution.

TABLE 1. Related work summary.

4 0

The PP/RP based approaches can jointly consider manydecisive factors such as load balance, energy consumption

and data gathering latency for improving the networkperformance when selecting the suitable moving locationsof the ME. Moreover, it can dynamically adapt to differentnetwork environment and different application requirements.To this end, our work in this paper belongs to this type andlearns frommany relevant researches for constructing the net-work structure and selecting appropriate RPs. Furthermore,the SPT-DGA [3] is typical of these PP/RP based approaches,and it is similar to our TCBDGA in many aspects. It alsoconstructs a tree topology from the beginning, and thendivides these trees by selecting some appropriate PPs onthem. The selected PPs are also regarded as sojourn locationsfor the ME to visit and perform data gathering. Moreover,to make a balance between energy consumption and datagathering latency, the selected PPs in both the SPT-DGAand our TCBDGA need to link as many sensor nodes aspossible with a limited relay hop bound. Ourwork outperformSPT-DGA in some aspects, which will be validated by thesimulation in the section of Simulation and Analysis.

III. NETWORK AND ENERGY CONSUMPTION MODELTheWSN for our TCBDGA consists ofN static and homoge-neous sensor nodes deployed randomly over an L× L squarearea, as shown in Fig. 1. There is a static BS located at thecenter of the distributed area. Amobile sinkmoves in the fieldperiodically from the BS to collect data. The mobile sink isassumed to be with unlimited energy, memory and computingresources. All sensor nodes are initially provisioned with thesame amount of energy E0 and location-aware abilities byusing localization technology [34], [35], and the mobile sinkknows all nodes’ locations. Each node has a limited commu-nication range Rs, and communication is always successfulif other node is within its transmission range. Each of themgenerates data with the same rate and sends them only oncein each data collection round. We assume that the wholedeployment area is full covered and well connected.

FIGURE 1. Network model.

The energy consumption of a sensor includes the power forsensing, receiving, and transmission. In this paper, we ignore

386 VOLUME 3, 2015


the energy consumption for sensing since it is negligiblecompared with the other two types of energy consumption.The energy consumption of one node for sending or receivinga message with K bits is calculated by the following formula:

ETx(K , d) = Eelec × K + εamp × K × d2 (1)

ERx(K ) = Eelec × K (2)

Where d is the physical distance between two sensornodes, and Eelec is the energy consumption factor indicatingthe power per bit incurred by the transmitting circuit orthe receiving circuit, εamp represents the coefficient for theamplifier to send one bit.

IV. TCBDGAA. OVERVIEWIn our proposed tree-cluster based data gathering algorithm,which is named TCBDGA for short, a mobile sink starts thedata collection tour periodically from the BS, stops at eachrendezvous point, collects data from the nodes in its one-hoprange directly, and finally, return back to the BS for oneround. The whole algorithm consists of three stages, treeconstruction, RPs and SRPs selection, and data collection.In tree construction stage, each node is weighted by itsresidual energy, its distance to the BS, the number of itstwo-hop neighbors and the average residual energy ofits one-hop neighbors. By communicating with its one-hopneighbors and comparing their weight, every node treats theneighbor with the maximum weight as its parent node. Thus,the data-gathering tree can finally be constructed. In RPs andSRPs selection stage, every root node of the trees is consid-ered as a Rendezvous Point. And each tree is decomposed intoa set of sub-trees according to its depth and its traffic load.Then Sub-Rendezvous Points (SRPs) can be found in thesub-trees for further load balancing. In the data collectionstage, the positions of RPs and SRPs are considered as thestop points of the mobile sink. Every sensor node will trans-mit its data to the corresponding one-hop neighbors of itsassociated RP or SRP, the latter will store and upload themto the mobile sink when it arrives.

In data collection stage, when the mobile sink visits eachRP/SRP every round, the latter decides whether or not to senda tree-reconstructing request according to the residual energyratio of its one-hop neighbors. If the condition is met, thecorresponding RP/SRP sends the request to the mobile sinkafter it arrives. Finally, the sink decides whether to performtree reconstruction according to the proportion of the numberof requests to the number of RPs and SRPs.

B. TREE CONSTRUCTIONThe main goal of selecting rendezvous points is to alleviatethe ‘‘hotspot problem’’ so as to achieve load balancing, andalso to balance the tradeoff between the energy consumptionand data gathering latency. So the residual energy and averageenergy of its one-hop neighbors should be focused on whenchoosing the parameters of one node’s weight to avoid the fast

depletion of the heavily loaded nodes. Meanwhile, in order toget theminimum number of stop points of themobile sink andminimize its tour length, sensor nodes with more neighbornodes and closest to the BS should have a higher priority tobe a rendezvous point. In addition, the authors of the literature‘‘Bounded Relay Hop Mobile Data Gathering in WirelessSensor Networks’’ [3] have proved that in order to pursuea tradeoff between the energy saving and data gatheringlatency, which achieves a balance between the relay hop countfor local data aggregation and the moving tour length of themobile sink, the relay hop count d should be constrained to asmall level (2 or 3) to limit the energy consumption at sensors.Therefore, in this paper we only consider the number oftwo-hop neighbors of a node to calculate its weight.

1) PRELIMINARIESIn the beginning, every node needs to calculate the num-ber of its two-hop neighbors and its corresponding weight.Therefore, each node should firstly broadcast a BRO_MSG_Tmessage, which contains its ID number, residual energy Erand a time to live TTL value. The initial value of TTL is setto 2, when a node receives a BRO_MSG_T, it forwards themessage to its neighbors if TTL has not reached 0. Thus, everyBRO_MSG_Twill only be received by its two-hop neighbors.After broadcasting, node i counts the number of its neigh-

bors in its one-hop range N 1−hopi and two-hop range N 2−hop

irespectively, according to the value of TTL and the numberof the BRO_MSG_T it received. Meanwhile, for node i, theaverage residual energy of its one-hop neighbors can becalculated through the formula:

E1−hopi = (

N 1−hopi∑1

Er )/N1−hopi (3)

We also define the selection weight of node i as follows:

Wi = N 2−hopi × Ei × E

1−hopi /Di (4)

Ei denotes the residual energy of node i, and Di denotes itsdistance to the BS which can be calculated by its coordinate.

2) PARENT NODE SELECTIONBy exchanging weight information derived by the equationabove with its one-hop neighbors, every node maintains a listof weights of its one-hop neighbors. By comparing its ownweight with its one-hop neighbors’, every node selects itsappropriate parent node respectively in accordance with thefollowing rules:• If there are neighbor nodes whose weight are bigger thannode i, then node i chooses the node with the maximumweight as its parent node, and node i itself is the childnode. Here, if more than one neighbor nodes have themaximum weight, node i chooses the node with thebiggest ID number as its final parent node.

To better understand this case, we illustrate an examplein Fig. 2. There are six sensor nodes. Node 1, 3, 4, 5, and 6are the one-hop neighbors of node 2. Here, we assume that

VOLUME 3, 2015 387


FIGURE 2. Example of selecting parent node with a bigger ID number.

the weight values of node 3 and node 5 are equal and thehighest among them. In this case, node 5 will be chosen asthe parent node of node 2 according to the rules above.If node 3 is selected as the parent node of node 2, thennode 3 may regard node 5 as its parent node, which leads thetransmission path of node 2 to increase onemore hop. In orderto avoid this situation, each sensor node needs to choose theneighbor with a bigger ID number as its final parent nodewhen more than one neighbors have the maximum weightvalue.• If none of the one-hop neighbor nodes of node i regardsnode i as its parent node, and the maximumweight valueof them is less than the weight of node i, then node iis an isolated node. To avoid such situation, node i stillchooses the neighbor node with the highest weight as itsparent node.

To illustrate this situation, we give an example in Fig. 3,where node 2, 3, 4, and 5 are the one-hop neighbors of node 1.We assume that the weight of node 5 is the highest amongthese one-hop neighbors of node 1, but it is lower whencompared with the weight of node 1. It’s assumed that theparent node of node 2, 3, 4, and 5 are node 9, 8, 7, and 7 asillustrated in Fig. 3. None of the one-hop neighbors of node 1will regard node 1 as its parent node, in this case, node 5 willbe chosen as its parent node to avoid node 1 becoming anisolated node.

FIGURE 3. Example of selecting proper parent nodes.

• If node i finds that node j is the one-hop neighbor nodewith the maximum weight, and the weight value ofnode i and node j are the same, then node i compares

the ID number of node j with its own. If i < j, thennode i finally chooses j as its parent node to avoid thesituation that both of them regard the other as its parentnode, thus, a closed loop will not be formed. If i > j,node i does not need to find a parent node any more if ithas a child node already, but if node i has no child nodes,the rule 2 above needs to be adopted to avoid node i tobe an isolated node.

Based on these rules, each tree eventually has one and onlyone node whose child nodes set is not empty, but none of itsneighbor nodes can be chosen as its parent node. Finally, thetree is set up.

3) ROOT NODE AND LEAF NODE IDENTIFICATIONThe sensor nodes in TCBDGA can be divided into threecategories: root nodes, leaf nodes, and normal nodes. Whennode i finds that it only has child nodes and no parentnode, it marks itself as a root node. When node i findsthat it only has a parent node and no child nodes, it marksitself as a leaf node. Other nodes are normal nodes. Even-tually, each tree only has one root node, and every nodeexcept root node on the tree has only one parent noderespectively.

4) HOP COUNT CALCULATIONAfter root nodes are marked, each tree can be identified bythe ID number of its root node. Each root node broadcastsa message BRO_MSG_H to all its child nodes, whichcontains two parts, the root ID and the hop level h. Theinitial value of h is set to 0, and it is increased by one whenthe BRO_MSG_H is received by a normal node or a leafnode. Every BRO_MSG_H will be forwarded until it reachesa leaf node. According to the information of BRO_MSG_H itreceived, every normal node, as well as leaf node, confirmsits associated tree (the ID number of root node) and calculatesits hop level to the corresponding root node. Also, the numberof child nodes of each normal node and root node can becounted on the basis of the relationship between parent nodesand child nodes.

Now the tree construction phase is finished, and the wholenetwork is divided into many tree clusters. Each tree isidentified by the ID number of its root node at the top ofthe tree. Due to the forwarding relationship between childnode and parent node of the tree, we can intuitively draw aconclusion that the root node may have the highest weightin the tree. Consequently, the residual energy, the averageresidual energy of their one-hop neighbors of the root nodesmay be the highest, and the number of their neighbors may bethe most. Moreover, each node on the tree knows its distanceaway from its corresponding root node and can calculate thenumber of its child nodes respectively. However, the depthof these trees are not under control. Some normal nodes mayhave a large number of child nodes and may be far away fromthe corresponding root nodes, which can cause heavy loadand traffic flow, and greatly influence the performance of thewhole network.

388 VOLUME 3, 2015


C. RENDEZVOUS POINT AND SUB-RENDEZVOUS POINTSELECTION1) RENDEZVOUS POINT SELECTIONAs mentioned in the tree construction phase, root nodes havethe highest weight in the tree. Therefore, compared with othernodes, root nodes may have the highest residual energy andthe maximum number of neighbor nodes. And the averageresidual energy of their one-hop neighbors may also be thehighest. Hence, it is naturally to select root nodes asRendezvous Points (RPs) for data collecting.

Since the neighbors of root nodes have relatively higherenergy than the others on the same tree, when the root nodesare selected as RPs, these RPs can avoid early death due to theheavy load and traffic flow, which in turn leads to a balancednetwork load and uniform energy consumption distribution.In addition, when the mobile sink traverses the network andstops at these RPs to collect data, the denser region can firstlybe visited, which shortens the data gathering latency in a way.

In this paper, we assume that each sensor forwards itssensory data to its corresponding parent node, all the datais transmitted in the direction of RPs. However, when thedepth of a tree is too large, which implies that some normalnodes may have a heavy traffic load (i.e. too many childnodes) and a large hop count to the corresponding RPs, thehotspot problem of the nodes closer to the RPs and theunbalanced energy consumption still exist. Therefore, inorder to alleviate the burden around the RPs and achieveuniform energy consumption, it’s necessary to decompose thetrees with large depth and heavy load into several sub-treesand select appropriate nodes to serve as Sub-RendezvousPoints (SRPs) in the sub-trees. The details are introduced inthe next section.

2) SUB-RENDEZVOUS POINT SELECTIONThemain objective of selecting Sub-Rendezvous Point (SRP)is to prevent sensor nodes with larger hop count to RPs andheavier traffic flow from forwarding their received data inthe direction of RPs, but in the direction of SRPs. It furtherminimizes the energy consumption of each sensor due tomulti-hop transmissions. Sowe jointly consider the hop countand the number of child nodes of each normal node to selectsuitable nodes as SRPs and decompose the trees.

Each node obtains some important information accordingto the BRO_MSG_H they received. It is listed below:• Its hop count to the corresponding RP, i.e. the hoplevel H (i,RP);

• The number of its child nodes CN (i);• The amount of data it forwards DS(i) = (CN (i) + 1) ×k bit, where k indicates the amount of data every singlenode sensed for a specified period.

If H (i,RP) > 2 and CN (i) ≥ 2, node i regards itselfas a candidate SRP, then it calculates its cost weightCW (i) = H (i,RP) × DS(i). Hence, sensor nodes that arefarther away from the selected RPs or have more than onepacket in their buffers have a higher priority of being recruitedas an SRP. All candidate SRPs exchange information with

their two-hop child nodes for cost weight comparison.The rules are as follows.

In the same sub-tree, if one of the candidate SRPs finds thatits parent node is not a candidate SRP but one or more of itschild nodes are candidate SRPs, then these SRPs will com-pare their cost weight parameter and the node with highestcost weight is selected as the final SRP of the tree. If their costweights are the same, then the ancestor node with less hopcount (i.e. closer to its corresponding RP) will be recruitedas the final SRP. If these candidate SRPs are within two hopsaway from each other, then all of them will regard themselvesas the final SRPs.

As illustrated in Fig. 4, sensor node 6 and 7 is candidateSRPs. But only node 7 is recruited as final SRP for the reasonthat its cost weight is higher than that of node 6 in accordancewith the above rules.

FIGURE 4. Example of constructed tree structure.

After all SRPs are identified, the tree decompositionoperation is performed. And, sub-trees are set up. Thus, nodeswith higher traffic load and larger hop count need not toforward data in the direction of their corresponding RPs.It reduces the number ofmulti-hop transmissions, and therebyminimize the energy consumption.

D. DATA UPLOADING, STORING AND DATA COLLECTION1) DATA UPLOADING AND STORINGAfter all RPs and SRPs are selected, the mobile sink beginsto perform its data gathering. When mobile sink moves to aRP or SRP, it polls each sensor within its transmission range.The sensors that receive the polling messages uploadpackets directly to the mobile sink via single-hop transmis-sions instead of sending large amount of data they receivedto the corresponding RPs or SRPs, which reduces the burdenof RPs and SRPs significantly. Thus, child nodes in theone-hop range of RPs or SRPs need to be identified beforedata gathering process. After that, other nodes only need toupload the packets to its corresponding parent node. So allpackets will be forwarded in the direction of the correspond-ing RPs or SRPs until being received and stored by theone-hop child nodes of RPs or SRPs.

Hence, RPs and SRPs only need to send a notification totheir one-hop child nodes to inform them to be data storagenodes (DSNs) at first. When packets from child nodes havebeen forwarded to these DSNs, the data uploading process

VOLUME 3, 2015 389


stops. Other nodes except these DSNs, may be two or morehops away from their corresponding RPs or SRPs, transmittheir data to their corresponding parent nodes respectively.Obviously, all packets are stored by these DSNs and then canbe uploaded to the mobile sink when it arrives.

To store all data of a tree in these DSNs, we can not onlytake advantage of the high residual energy of them, but alsoreduce the heavy load of the RPs or SRPs caused by receivingtoo large amount of data from the whole tree.

2) DATA COLLECTIONA data collection stage consists of many data collectionround. After each data collection, the mobile sink decideswhether to reconstruct the whole network trees based on someconditions. The details is discussed in the next subsection of‘‘RPs and SRPs Re-selection’’. In each data collection round,the mobile sink needs to move in the network deploymentarea periodically to perform data gathering. The selectedRPs and SRPs report their location information to theBS firstly. After getting these location information from BS,mobile sink invokes an approximation algorithm for theTSP (Traveling Salesman Problem) to plan an approximatelyoptimal route for these RPs and SRPs. It then moves to thesenodes one by one to gather data from sensors in the prox-imity via single-hop transmissions. When arriving at theseRPs or SRPs, the mobile sink firstly sends a polling messageto these DSNs and RPs or SRPs in its one-hop range, stopsat these RPs and SRPs for data gathering, and then moves toanother position until all data of the tree have been uploaded.In this manner, the mobile sink visits all the trees and finishesits one data collection round.

E. RPs AND SRPs RE-SELECTIONAs discussed above, the energy consumption of DSNs signifi-cantly influences the network lifetime, since they have almostthe heaviest load of their corresponding trees. Hence in eachdata collection round, after all data of the correspondingtree of the RPs or SRPs is uploaded to the mobile sink, theRPs or SRPs retrieve the residual energy of their one-hopchild nodes (i.e. the DSNs), and then decide whether to senda request of performing tree reconstruction to the mobilesink. The mobile sink calculates whether to perform treereconstruction according to the ratio of the accumulate valueof requests to the number of RPs and SRPs. The details areas follows:

1) In each data collection round, after all data of thecorresponding tree is received by the mobile sink, thecorresponding RP or SRP retrieves the residual energyof its one-hop child nodes. If at least one of these nodeshas residual energy below m% of its initial energyrecorded in the first data collection round after thetree constructing or reconstructing, then the RP or SRPneeds to send a request of tree reconstruction to themobile sink before it moves to next position.

2) After these RPs and SRPs are visited by the mobile sinkthe mobile sink calculates the ratio of the accumulate

value of requests to the number of RPs and SRPsbefore the next tree reconstruction stage. If it is higherthan n%, then the tree reconstruction is performed andnew RPs and SRPs are selected as the stop points of themobile sink for data gathering. Otherwise, in the nextround the mobile sink still traverses the originalRPs and SRPs of last round for data collection.

3) Every time after tree reconstruction, the new RPsand SRPs need to retrieve the residual energy of itsone-hop child nodes and record it in the first datacollection round. Then the value is a benchmark forthese RPs and SRPs to make a decision about whetherto send a tree reconstruction request to the mobilesink after each data collection round. And the mobilesink continues to calculate whether to perform a treereconstruction based on the value of m and n.

From the procedures above, we believe that there is atradeoff between the value of m and n and the cost of treereconstructing. If m is too large, which means RPs or SRPsneed to send a request when the residual energy of the DSNsis still very high, in order to reduce the cost of frequently treereconstruction, n needs to be set with a large value, whichimplies that the mobile sink performs tree reconstructionwhen it receive a large proportion of requests. On the contrary,when m is set too small, which means the residual energy ofthe DSNs is very low, in order to avoid the fast depletion ofthem, n should be constrained to a small value to make themobile sink perform tree reconstruction as soon as possible,which means to sacrifice the cost of tree reconstruction toprolong the whole network lifetime. We validate this bysimulations in the next section.

V. SIMULATIONS AND ANALYSISIn this section, in order to evaluate the performance of ourTCBDGA and compare with other algorithms, we implementtwo different sets of simulations using the well-known toolMATLAB. First, we perform a set of experiments to validatethe network performance with different values of m and n(introduced in the last section) under the same simulationconditions, and select a pair of m and n with approximatelythe best network performance for comparison with otherschemes. Then we adopt the derived m and n for tree recon-struction in our TCBDGA and compare its performance withour TCBDGAwith no reconstruction, and with the SPT-DGAunder different scenarios.

A. SIMULATION MODELTable 2 summarizes the important simulation parameters andtheir values. In our simulations, sensor nodes are randomlydeployed in a square area with the size of L×L, and the valueof L varies from 100m to 500m. The number of the staticsensor nodes N increases from 100 to 500. All the sensornodes have the same communication range Rs and the sameinitial energy E0. Also, there is a BS located at the centerof the deployed area. Other parameters are listedin Table 2.

390 VOLUME 3, 2015


TABLE 2. Simulation parameters.

In all of our experiments, sensor nodes use the CSMAprotocol for channel access and broadcasting the controlpackets. In each data collection round, each sensor node gen-erates a fixed size packet and sends it to its parent. Note thatthere are two different sizes of packets in our TCBDGA, oneis the messages of each node exchanged with their neighborsto select appropriate parent and to perform tree constructionor reconstruction, which is used for calculating the expenseof the tree reconstruction. The other type of the packets isthe sensory data of each node, which is transmitted to theircorresponding parent node, and then is uploaded to themobilesink. In our experiments, we define these two types of packetsas k0 and k1 respectively, and the value of them are listed inthe Table 2. Moreover, we do not consider the packet lossdue to collision and interference at the MAC layer for thereason that collisions among transmissions can be avoidedby assigning different time slots to each of the child nodes.Also, sensing and computation cost for data aggregation areconsidered to be negligible in our simulations.

During the data gathering process of our experiments, weadopt the nearest neighbor (NN) algorithm for the problemof TSP [14] to determine an approximately optimal tour forthe mobile elements in both our proposed TCBDGA and theSPT-DGA, which allows the mobile elements to start fromthe BS and choose the nearest unvisited RPs or PPs for thenext visit, and eventually move back to the BS.

To verify the network performance, we choose the follow-ing performance metrics:• Network Lifetime: The number of the data collectionround until the first sensor node of the network dies as aresult of depleting its energy resources.

• Average Tour Length: The average tour length of themobile element that starts from the BS, traverses eachof the selected RPs or PPs using the NN algorithm andfinally returns back to the BS in each of the data

collection rounds. It represents the data gatheringlatency of the corresponding algorithm.

• Average Times of Reconstructions: The average timesof reconstructions in each data collection stage of ourproposed TCBDGA with different values of m and n.

B. PERFORMANCE OF TCBDGA WITH DIFFERENTVALUES OF m AND nWe conduct a set of experiments of our proposed TCBDGAwith different pairs of m and n to evaluate the influence ofthem on the network performance.We assume that 200 sensornodes with the same initial energy E0 and the same transmis-sion range 30m are distributed randomly and uniformly in a200m×200m square area. And a BS is located at the center ofthe square area. Each sensor node generates a fixed size(i.e. k1 = 2000) of packet and uploaded it to its correspondingparent node in each data collection round. Other parametersare listed in the Table 2.

As described in Section IV, the variation of the ratiom and nsignificantly influences the frequency of tree reconstruction,which leads to different energy consumption of the network.To verify the difference, we set m from 0 to 100, andn to 20, 40, 60, 80, and 100, respectively. The averagetimes of reconstructions and network lifetime are illustratedin Fig. 5 and Fig. 6, respectively.

FIGURE 5. Average times of reconstructions of TCBDGA with differentpairs of m% and n%.

From Fig. 5, we can clearly see that the average times ofreconstructions increases monotonously as m increases, nomatter what the value of n is. It is easily to understand for thereason that m represents the residual energy ratio of DSNs.When the residual energy ratio of a DSN is below m%,it sends a request of reconstruction to the mobile sink. There-fore, when m is set to a small value, these DSNs sends thereconstruction requests until they consume a large amount ofenergy, thus leading to a low frequency of reconstructions.However, when m becomes larger, the mobile sink receives alot of reconstruction requests and perform tree reconstructionfrequently when the DSNs still have enough energy.

VOLUME 3, 2015 391


FIGURE 6. Network lifetime of TCBDGA with different pairsof m% and n%.

Fig. 6 shows the network lifetime performance of ourTCBDGAwith different pairs ofm and n, and we can see thatit increases from the beginning with the increasing value ofm and then drops sharply whenm is above 50. That’s becausethe average times of reconstructions influences the networklifetime greatly. As illustrated in Fig. 5, when m is set to asmall value, only few reconstructions need to be performed,which alleviates the load of these nodes with heavy burdento some degree. However, a large value of m (i.e. above 50)incurs frequent tree reconstructions. It obviously causes ahuge expense of energy consumption. When the cost of treereconstruction plays the major role of energy consumptioncompared with other communication costs, it influences thenetwork lifetime most and makes it drop sharply.

Moreover, we can also figure out the relationship betweenm and n from Fig. 6.Whenm is below 50, it is obvious that thenetwork lifetime is longer when n is smaller (such as n = 20).While, whenm is above 50, it is the opposite. It is reasonable,because if m is set too small, which means the DSNs depletethe majority of their residual energy. In order to avoid the fastdepletion of their energy, n should be constrained to a smallvalue to make the mobile sink perform tree reconstructionas soon as possible. It means to sacrifice the cost of treereconstruction to prolong the whole network lifetime. Whenm becomes larger, RPs or SRPs need to send a request whenthe residual energy of the DSNs is still very high. In orderto reduce the cost of frequently tree reconstruction, n needsto be set to a larger value to make the mobile sink performtree reconstruction only when it receives a large proportionof requests.

At the same time, there are three particular situations. Theyare m = 0, 50, 100 in Fig. 6. For the case m = 0, it meansthat our TCBDGA only performs tree reconstruction whenevery DSN depletes its energy, which implies to constructthe tree only once before the first round and then with noreconstructions any more, the network consume energy onlyby transmitting or receiving data in each of the data collection

round until the network dies. For the case m = 100, it meansthat the tree reconstruction is performed after every datacollection round until the network dies. For the case m = 50,we can see the network lifetime is the longest, so we justselect the approximately optimal pair of m = 50 and n = 60for our TCBDGA in comparison with other algorithms in thenext sub-section.

C. PERFORMANCE COMPARISON UNDERDIFFERENT PARAMETERSThe SPT-DGA [3] is similar to our TCBDGA in some way,such as the tree topology it builds and the method it uses toselect a set of optimal PPs. Also, the selection of PPs from theconstructed tree in SPT-DGA is just like tree decompositionand the selection of RPs and SRPs in our TCBDGA. And,to select these stop points for the mobile elements, bothSPT-DGA and TCBDGA have limited the relay hop boundto a small value to minimize the energy consumption ofdata transmissions. In addition, both of these two algorithmschoose a typical set of sensors for the mobile elements to visitusing the same NN algorithm for the TSP problem.

However, the SPT-DGA has some limitations, such as:• It does not consider the energy factor of each node whenselecting PPs;

• There is no strategy for the replacement of theselected PPs, while these PPs are responsible for receiv-ing large number of packets from their child nodes.It may lead to a fast energy depletion of the PPs if theyare not replaced after a few rounds.

In this sub-section of simulations, both of these twoalgorithms consider sensor nodes with the same initial energydeployed over an L × L square area, and there is a BSlocated at the center of the deployed area. To verify theperformance of our TCBDGA compared with the SPT-DGA,three scenarios are conducted by adjusting three parameters,the number of sensor nodes, transmission range and the sizeof the network. The number of nodes varies from 100 to 500,the transmission range Rs varies from 10m to 50m, and theside length of the network L is set from 100m to 500m. Otherparameters are the same as that in Table 2.

Also, the relay hop bound d in the SPT-DGA is set to 2when compared with our proposed algorithm. We selectthe pair of m = 50 and n = 60 in our TCBDGA forcomparison since the network lifetime is approximately thelongest. We analyse the network lifetime performance andthe average tour length of TCBDGA from two aspects. Oneis the comparison with SPT-DGA, which can be treated asa data gathering algorithm without reconstructions, and theselected PPs are not changed in every data collection rounduntil the first node dies. The other one is the comparison withitself, but without reconstructions, which can be obtained bysetting m = 0 as discussed in the sub-section above.Hence, there are three algorithms to compare:• TCBDGA (m = 50, n = 60)• SPT-DGA (d = 2)• TCBDGA (no reconstructions)

392 VOLUME 3, 2015


1) THE IMPACT OF THE NUMBER OF NODES NFig. 7 and Fig. 8 plot the performance of the three algorithmsas a function ofN .N varies from 100 to 500, L is set to 200m,and the transmission range Rs is equal to 30m.

FIGURE 7. Comparison of network lifetime.

FIGURE 8. Comparison of mobile sink average tour length.

From Fig. 7, we can see that the network lifetime of ourTCBDGA is almost 9.4 times than that of SPT-DGA whenN is set to a small number 200, and even the TCBDGAwith no reconstructions has a longer lifetime than that ofSPT-DGA. As the number of nodes increases, the networklifetime in our TCBDGA decreases, but always longer thanthat of SPT-DGA. There are three reasons accounting for thisphenomenon. First, we take the energy factor into consid-eration when selecting parent node and RPs, only nodes ortheir neighbors with higher energy can be recruited as RPs.Secondly, the selected RPs need not to receive large amountsof data compared with the SPT-DGA, which further reducethe burden of the RPs, SRPs and the DSNs around them.Finally, we adopt the adjusting strategy of RPs and SRPs,and the tree reconstruction operation balances the load of thewhole network to a degree. All these three aspects contributeto prolonging the network lifetime.

Fig. 8 shows the average tour length of the threealgorithms. When N is small, the SPT-DGA performs a littlebetter than TCBDGA, but when N becomes larger, the aver-age tour length of our TCBDGA decreases. That’s becausethat we consider the node density when selecting parent nodeof each node, so that the selected number of RPs is lessthan that of SPT-DGA. This significantly influences the tourlength when the number of nodes becomes larger. Also, wecan see our TCBDGAwithout reconstructions always has theshortest tour length about 640m compared with the other twoalgorithms, but we do not regard it as the best one for thereason that it does not consider the load balancing of thewhole network. The result implies that it reduces the datagathering latency by sacrificing the energy saving.

2) THE IMPACT OF THE TRANSMISSION RANGE Rs

To verify the effect of transmission range of nodes Rs onthe network lifetime and average tour length of each datacollection round, N is set to 200 and L is equal to 200mwhenRs varies from 10m to 50m.Fig. 9 shows the network lifetime performance of the three

algorithms as a function of Rs. Obviously, we know moresensor nodes can become neighbors to each other asRs increases. As a result, the energy consumption of datatransmitting or receiving have a rise due to the increasing ofchild nodes of each RPs and SRPs. It leads to a decreasingnetwork lifetime. However, the network lifetime of ourTCBDGA is always as twice as that of SPT-DGA, for thereason that our TCBDGA adopts the adjusting strategy forRPs and SRPs. Even TCBDGA without reconstructionsperform a little better than SPT-DGA for that it considers theenergy factor for selecting RPs.

FIGURE 9. Comparison of network lifetime with varying transmissionrange Rs.

Fig. 10 illustrates the average tour length of the threealgorithms as a function of Rs. As with the increasingof Rs, each of the RPs or SRPs in our TCBDGA or PPs inthe SPT-DGA are able to link more sensor nodes.

VOLUME 3, 2015 393


FIGURE 10. Comparison of average tour length with varying transmissionrange Rs.

Therefore, the average tour length of the mobile elementsis greatly shortened with the reduction of stop points. Butthe tour length of TCBDGA is always larger than that ofSPT-DGA no matter what the Rs is, for example, whenRs is 10m, the tour length of TCBDGA and the SPT-DGAis 2250m and 750m, respectively. When Rs increases to 45m,they are 800m and 400m, respectively. That’s because whenselecting RPs and SRPs in our TCBDGA, in addition tothe distance of nodes to the BS, we also consider theresidual energy of each node, which balance the tradeoffbetween the energy consumption and data gatheringlatency.

3) THE IMPACT OF THE SIDE LENGTH OFDISTRIBUTED AREA LFig. 11 and Fig. 12 further plot the performance of differentalgorithms when the side length L of distributed area varies

FIGURE 11. Comparison of network lifetime with varying side length ofdistributed area L.

FIGURE 12. Comparison of average tour length with varying side lengthof distributed area L.

from 100m to 500m. The number of nodes N is set to 200,Rs is 30m. All other settings are kept unchanged as in theprevious set of simulations.

In Fig. 11, due to the different tree construction strategyof the TCBDGA, the network lifetime has a fluctuation asL increases. But our TCBDGA always performs better in thenetwork lifetime than that of SPT-DGA for the reason thatthe RPs and SRPs are reselected after reconstructions. Also,we consider the residual energy of each node when selectingRPs and SRPs for load balancing in the algorithms ofTCBDGAwith or without reconstructions. It further prolongsthe network lifetime.

From Fig. 12, we can see that as L increases, the averagetour length of all algorithms becomes longer. That’s rea-sonable because sensor nodes are more sparsely distributedas L becomes larger. The number of the stop points in ourTCBDGA and SPT-DGA, and the distance of them isincreased, which leads to a long tour length for the mobileelements. But it is noticed that the tour length of ourTCBDGA and the TCBDGA with no reconstructions arealmost 67 percent and 55 percent longer than that ofSPT-DGAwhen L becomes a larger value such as 400m. Thatis because we consider the energy of each node when select-ing suitable RPs and SRPs, but sacrifice the data gatheringinstantaneity for obtaining a longer lifetime.

VI. CONCLUSIONIn this paper, we investigate recent typical literatures ondata gathering algorithms, and classify them according to thesensed data uploading path or affected area. Based on thecomparison and analysis, we propose a tree-cluster based datagathering algorithm with a mobile sink for WSNs. First, weintroduce a weight-based tree construction method, wherethe weight of each sensor node jointly considers the aver-age residual energy, the distance to the BS, and local nodedensities. After the trees are constructed, RPs are selected.

394 VOLUME 3, 2015


Then a tree decomposition method is proposed. Both thehop count and the amount of data of each node are takeninto consideration when selecting SRPs on sub-trees, whichfurther balances the energy consumption. We also introducean adjusting method for the RPs and SRPs reselection, inorder to alleviate the heavy burden around them. The simu-lation results demonstrate that our TCBDGA can prolong thenetwork lifetime significantly compared with the other datagathering algorithm SPT-DGA. It is more suitable for widearea, especially in industrial environment involving massiveamount of heterogeneous sensory data.

REFERENCES[1] C. Zhu, C. Zheng, L. Shu, and G. Han, ‘‘A survey on coverage and

connectivity issues in wireless sensor networks,’’ J. Netw. Comput. Appl.,vol. 35, no. 2, pp. 619–632, Mar. 2012.

[2] G. Han, X. Jiang, A. Qian, J. J. P. C. Rodrigues, and L. Cheng, ‘‘A compara-tive study of routing protocols of heterogeneouswireless sensor networks,’’Sci. World J., vol. 2014, Jun. 2014, Art. ID 415415.

[3] M. Zhao and Y. Yang, ‘‘Bounded relay hop mobile data gatheringin wireless sensor networks,’’ IEEE Trans. Comput., vol. 61, no. 2,pp. 265–277, Feb. 2012.

[4] M. Ma, Y. Yang, and M. Zhao, ‘‘Tour planning for mobile data-gatheringmechanisms in wireless sensor networks,’’ IEEE Trans. Veh. Technol.,vol. 62, no. 4, pp. 1472–1483, May 2013.

[5] M. Zhao, M. Ma, and Y. Yang, ‘‘Efficient data gathering with mobilecollectors and space-division multiple access technique in wirelesssensor networks,’’ IEEE Trans. Comput., vol. 60, no. 3, pp. 400–417,Mar. 2011.

[6] H. Salarian, K.-W. Chin, and F. Naghdy, ‘‘An energy-efficient mobile-sinkpath selection strategy for wireless sensor networks,’’ IEEE Trans. Veh.Technol., vol. 63, no. 5, pp. 2407–2419, Jun. 2014.

[7] J. Rao and S. Biswas, ‘‘Analyzing multi-hop routing feasibility for sensordata harvesting using mobile sinks,’’ J. Parallel Distrib. Comput., vol. 72,no. 6, pp. 764–777, 2012.

[8] D. Gong, Y. Yang, and Z. Pan, ‘‘Energy-efficient clustering in lossywireless sensor networks,’’ J. Parallel Distrib. Comput., vol. 73, no. 9,pp. 1323–1336, 2013.

[9] S. Vupputuri, K. K. Rachuri, and C. S. R. Murthy, ‘‘Using mobiledata collectors to improve network lifetime of wireless sensor networkswith reliability constraints,’’ J. Parallel Distrib. Comput., vol. 70, no. 7,pp. 767–778, 2010.

[10] C. Konstantopoulos, G. Pantziou, D. Gavalas, A. Mpitziopoulos, andB.Mamalis, ‘‘A rendezvous-based approach enabling energy-efficient sen-sory data collectionwithmobile sinks,’’ IEEE Trans. Parallel Distrib. Syst.,vol. 23, no. 5, pp. 809–817, May 2012.

[11] M.Marta andM. Cardei, ‘‘Improved sensor network lifetime with multiplemobile sinks,’’ PervasiveMobile Comput., vol. 5, no. 5, pp. 542–555, 2009.

[12] E. Lee, S. Park, S. Oh, and S.-H. Kim, ‘‘Rendezvous-based datadissemination for supporting mobile sinks in multi-hop clustered wirelesssensor networks,’’ Wireless Netw., vol. 20, no. 8, pp. 2319–2336, 2014.

[13] C.-F. Wang, J.-D. Shih, B.-H. Pan, and T.-Y. Wu, ‘‘A network lifetimeenhancement method for sink relocation and its analysis in wireless sensornetworks,’’ IEEE Sensors J., vol. 14, no. 6, pp. 1932–1943, Jun. 2014.

[14] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction toAlgorithms. Cambridge, MA, USA: MIT Press, 2001.

[15] R. Sugihara and R. K. Gupta, ‘‘Optimal speed control of mobile node fordata collection in sensor networks,’’ IEEE Trans. Mobile Comput., vol. 9,no. 1, pp. 127–139, Jan. 2010.

[16] X. Zhang, H. Bao, J. Ye, K. Yan, and H. Zhang, ‘‘A data gathering schemefor WSN/WSAN based on partitioning algorithm and mobile sinks,’’ inProc. 10th IEEE Int. Conf. High Perform. Comput. Commun., IEEE Int.Conf. Embedded Ubiquitous Comput., Nov. 2013, pp. 1968–1973.

[17] J.-W. Kim, J.-S. In, K. Hur, J.-W. Kim, and D.-S. Eom, ‘‘An intelligentagent-based routing structure for mobile sinks in WSNs,’’ IEEE Trans.Consum. Electron., vol. 56, no. 4, pp. 2310–2316, Nov. 2010.

[18] J. Yu, E. Jeong, G. Jeon, D.-Y. Seo, and K. Park, ‘‘A dynamic multiagent-based local update strategy for mobile sinks in wireless sensor networks,’’inComputational Science and Its Applications (Lecture Notes in ComputerScience), vol. 6785. Berlin, Germany: Springer-Verlag, 2011, pp. 185–196.

[19] D. Xie, X. Wu, D. Li, and J. Sun, ‘‘Multiple mobile sinks data dissemina-tion mechanism for large scale wireless sensor network,’’ China Commun.,vol. 11, no. 13, pp. 1–8, 2014.

[20] E. Lee, S. Park, F. Yu, Y. Choi, M.-S. Jin, and S.-H. Kim, ‘‘A predictablemobility-based data dissemination protocol for wireless sensor networks,’’in Proc. 22nd Int. Conf. Adv. Inf. Netw. Appl., Mar. 2008, pp. 741–747.

[21] L. Shi, B. Zhang, H. T. Mouftah, and J. Ma, ‘‘DDRP: An efficient data-driven routing protocol for wireless sensor networks with mobile sinks,’’Int. J. Commun. Syst., vol. 26, no. 10, pp. 1341–1355, 2013.

[22] X. Liu, H. Zhao, X. Yang, and X. Li, ‘‘SinkTrail: A proactive data reportingprotocol for wireless sensor networks,’’ IEEE Trans. Comput., vol. 62,no. 1, pp. 151–162, Jan. 2013.

[23] C. Tunca, S. Isik, M. Y. Donmez, and C. Ersoy, ‘‘Distributed mobile sinkrouting for wireless sensor networks: A survey,’’ IEEE Commun. SurveysTuts., vol. 16, no. 2 pp. 877–897, May 2014

[24] R. Zhu, Y. Qin, and J. Wang, ‘‘Energy-aware distributed intelligent datagathering algorithm in wireless sensor networks,’’ Int. J. Distrib. SensorNetw., vol. 7, no. 4, pp. 272–280, 2011.

[25] A. K. Kumar, K. M. Sivalingam, and A. Kumar, ‘‘On reducing delay inmobile data collection based wireless sensor networks,’’ Wireless Netw.,vol. 19, no. 3, pp. 285–299, 2013.

[26] J. Wang, Z. Zhang, J. Shen, F. Xia, and S. Lee, ‘‘An improved stable elec-tion based routing protocol withmobile sink for wireless sensor networks,’’in Proc. IEEE Int. Conf. Green Comput. Commun., IEEE Internet ThingsIEEE Cyber, Phys. Soc. Comput., Aug. 2013, pp. 945–950.

[27] S. Oh, Y. Yim, J. Lee, H. Park, and S.-H. Kim, ‘‘Non-geographical shortestpath data dissemination for mobile sinks in wireless sensor networks,’’ inProc. IEEE Veh. Technol. Conf. (VTC Fall), Sep. 2011, pp. 1–5.

[28] M. Youssef, A. Youssef, andM. Younis, ‘‘Overlapping multihop clusteringfor wireless sensor networks,’’ IEEE Trans. Parallel Distrib. Syst., vol. 20,no. 12, pp. 1844–1856, Dec. 2009.

[29] A. W. Khan, A. H. Abdullah, M. A. Razzaque, and J. I. Bangash,‘‘VGDRA: A virtual grid-based dynamic routes adjustment scheme formobile sink-based wireless sensor networks,’’ IEEE Sensors J., vol. 15,no. 1, pp. 526–534, Jan. 2015.

[30] S. Mottaghia andM. R. Zahabi, ‘‘Optimizing LEACH clustering algorithmwith mobile sink and rendezvous nodes,’’ AEU-Int. J. Electron. Commun.,vol. 69, no. 2, pp. 507–514, 2015.

[31] C.-Y. Chang, C.-Y. Lin, and C.-H. Kuo, ‘‘EBDC: An energy-balanced datacollection mechanism using a mobile data collector in WSNs,’’ Sensors,vol. 12, no. 5, pp. 5850–5871, 2012.

[32] N. Wang, X.-G. Qi, L. Duan, H. Jiang, and X. Liu, ‘‘Clustering-basedrouting algorithm in wireless sensor networks with mobile sink,’’ J. Netw.,vol. 9, no. 9, pp. 2376–2383, Sep. 2014.

[33] A. Kinalis, S. Nikoletseas, D. Patroumpa, and J. Rolim, ‘‘Biased sinkmobility with adaptive stop times for low latency data collection in sensornetworks,’’ Inf. Fusion, vol. 15, no. 1, pp. 56–63, Jan. 2014.

[34] G. Han, J. Chao, C. Zhang, L. Shu, and Q. Li, ‘‘The impacts of mobilitymodels on DV-hop based localization in mobile wireless sensor networks,’’J. Netw. Comput. Appl., vol. 42, no. 6, pp. 70–79, Jun. 2014.

[35] G. Han, H. Xu, T. Q. Duong, J. Jiang, and T. Hara, ‘‘Localization algo-rithms of wireless sensor networks: A survey,’’ Telecommun. Syst., vol. 52,no. 4, pp. 2419–2436, 2013.

CHUAN ZHU received the Ph.D. degree from theDepartment of Computer Science, NortheasternUniversity, Shenyang, China, in 2009. He iscurrently a Lecturer with the Department ofInformation and Communication System, HohaiUniversity, China. He holds a post-doctoral posi-tion with Hohai University. He has authored overten papers in related international conferences andjournals. His current research interests are sensornetworks, cloud computing, and computer

networks.

VOLUME 3, 2015 395


SHUAI WU is currently pursuing the master’sdegree from the Department of Information andCommunication System, Hohai University, China.His current research interests are wireless sensornetworks and computer networks.

GUANGJIE HAN (S’03–M’05) received thePh.D. degree from the Department of ComputerScience, Northeastern University, Shenyang,China, in 2004. He was with ZTE Company from2004 to 2006, where he was the Product Manager.He held a post-doctoral position with the Depart-ment of Computer Science, Chonnam NationalUniversity, Korea, until 2008. He was a VisitingResearch Scholar with Osaka University from2010 to 2011. He is currently a Professor with the

Department of Information and Communication System, Hohai University,China. He has authored over 130 papers in related international conferencesand journals. He holds 65 patents. His current research interests are sensornetworks, computer communications, mobile cloud computing, multimediacommunication, and security. He is a member of the Association for Com-putingMachinery. He was a recipient of the ComManTel 2014, ComComAP2014, and Chinacom 2014 Best Paper Awards. He has served on the EditorialBoard of up to 14 international journals, including the Journal of InternetTechnology, the International Journal of Ad Hoc and Ubiquitous Computing,and KSII Transactions on Internet and Information Systems. He has servedas the Co-Chair for over 30 international conferences/workshops, anda TPC Member for over 100 conferences. He has served as a Reviewer forover 50 journals.

LEI SHU (M’07) received the Ph.D. degreefrom the National University of Ireland, Galway,Ireland, in 2010. Until 2012, he was a SpeciallyAssigned Researcher with the Department ofMultimedia Engineering, Graduate School ofInformation Science and Technology, Osaka Uni-versity, Japan. Since 2012, he has been with theGuangdong University of Petrochemical Tech-nology, Maoming, China, as a Full Professor.Since 2013, he has been a Ph.D. Supervisor

with the Dalian University of Technology, Dalian, China, and a master’sSupervisor with the Beijing University of Posts and Telecommunications,Beijing, China. He has also been the Vice Director of the GuangdongProvincial Key Laboratory of Petrochemical Equipment Fault Diagnosiswith the Guangdong University of Petrochemical Technology. He is cur-rently the Founder of the Industrial Security and Wireless Sensor Net-works Laboratory. He has authored over 200 papers in related confer-ence proceedings, journals, and books. He has an h-index of 18. Hisresearch interests include wireless sensor networks, multimedia commu-nication, middleware, security, and fault diagnosis. He is a member ofthe IEEE Communication Society, the European Alliance for Innovation,and the Association for Computing Machinery. He was a recipient of the2010 IEEE Global Communications Conference and 2013 IEEE Inter-national Conference on Communications Best Paper Awards. He servedas the Co-Chair for over 50 various international conferences/workshops,e.g., the IEEE International Wireless Communications and Mobile Com-puting Conference (IWCMC), the IEEE International Conference onCommunications (ICC), the IEEE Symposium on Computers andCommunications (ISCC), the IEEE International Conference on Computing,Networking and Communication, and the International Conferenceon Communications and Networking in China (Chinacom). He alsoserved/will serve as the Symposium Co-Chair of IWCMC 2012and ICC 2012, the General Chair of Chinacom 2014 and theInternational Conference on Heterogeneous Networking for Quality,Reliability, Security, and Robustness 2015, the Steering Chair ofthe International Conference on Industrial Networks and IntelligentSystems 2015, and a Technical Program Committee Member for over150 conferences, including the IEEE International Conference onDistributedComputing in Sensor Systems, the IEEE International Conference onMobileAd hoc and Sensor Systems, ICC, Globecom, the IEEE International Con-ference on Computer Communications and Networks, the IEEE WirelessCommunications and Networking Conference, and ISCC. He serves as theEditor-in-Chief of the European Alliance for Innovation Endorsed Transac-tions on Industrial Networks and Intelligent Systems, and an Associate Editorof a number of renowned international journals.

HONGYI WU (M’07) received the B.S. degree inscientific instruments from Zhejiang University,Hangzhou, China, in 1996, and the M.S. degreein electrical engineering and the Ph.D. degree incomputer science from the State University of NewYork at Buffalo, in 2000 and 2002, respectively.He has been with the Center for Advanced Com-puter Studies, University of Louisiana at Lafayette(UL Lafayette), where he is currently a Professorand holds the Alfred and Helen Lamson Endowed

Professorship in Computer Science. His research spans delay-tolerantnetworks, radio frequency identification systems, wireless sensor networks,and integrated heterogeneous wireless systems. He was a recipient of theNSFCAREERAward in 2004, and theULLafayette Distinguished ProfessorAward in 2011.

396 VOLUME 3, 2015

A Tree-Cluster-Based Data-Gathering Algorithm for Industrial WSNs ...

Documents

Transcript of A Tree-Cluster-Based Data-Gathering Algorithm for Industrial WSNs ...