300 IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 16, NO. …

A Weighted Crowdsourcing Approach for Network Quality Measurement in Cellular Data Networks

Yaliang Li, Jing Gao, Member, IEEE, Patrick P. C. Lee, Member, IEEE, Lu Su, Member, IEEE, Caifeng He, Cheng He, Fan Yang, and Wei Fan

Abstract—With ubiquitous smartphone usage, it is important for network providers to provide high-quality service to every user in the network. To make more effective planning and scheduling, network providers need an accurate estimate of network quality for base stations and cells from the perspective of user experience. The traditional drive testing approach provides a quality measurement for each area, and the quality measurement is obtained from the equipment in a moving vehicle. This approach suffers from the limitations of high costs, low coverage, and out-of-date values. In this paper, we propose a novel crowdsourcing approach for the task of network quality estimation, which incurs little cost and provides timely and accurate quality estimation. The proposed approach collects quality measurements from individual end users within a certain network or cell coverage area, and then aggregates these measurements to obtain a global measurement of network quality. We propose an effective aggregation scheme which infers the information weights of end users and incorporates such weights into the estimation of network quality. Experiments are conducted on two datasets collected from citywide 3G networks, which involve 616,796 users and 22,715 cells. We validate the effectiveness of the proposed approach compared with a baseline method. From the aggregated measurement results, we observe some interesting patterns about network quality, which can be explained by network usage and traffic behavior. We also show that the proposed approach runs in linear time.

Index Terms—Crowdsourcing, network quality measurement, cellular network

• Y. Li, J. Gao, and L. Su are with the University at Buffalo, The State University of New York, 338 Davis Hall, Buffalo, NY 14260. E-mail: {yaliangl, jing, lusu}@buffalo.edu.

• P. P. C. Lee is with the Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong. E-mail: [email protected].

• C. He and C. He are with the Huawei Noah's Ark Lab, 525-520, 5/F, Core Building 2, Hong Kong Science Park, Shatin, Hong Kong. E-mail: {hecaifeng, hecheng}@huawei.com.

• F. Yang is with the Tencent Holdings Limited, Shenzhen, Guangdong, China. E-mail: [email protected].

• W. Fan is with the Baidu Research Big Data Lab, 1195 Bordeaux Drive, Sunnyvale, CA 94089. E-mail: [email protected].

Manuscript received 21 Apr. 2015; revised 27 Feb. 2016; accepted 15 Mar. 2016. Date of publication 25 Mar. 2016; date of current version 5 Jan. 2017. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below. Digital Object Identifier no. 10.1109/TMC.2016.2546900

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 16, NO. 2, FEBRUARY 2017. 1536-1233 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

1 INTRODUCTION

Nowadays, smartphones are playing indispensable roles in people's lives. People use them to communicate and share items with friends through social networks, watch real-time videos, send and receive emails, etc. With the explosion of smartphone users, third generation (3G) cellular networks have been deployed widely over the world, and the transition to the fourth generation (4G) has started. The ultimate goal of network providers is to provide a high-quality network experience to all users. One key factor in providing high-quality network service is the ability to accurately measure network quality in real time, so that actions can be taken immediately when network quality drops.

In this paper, we study the problem of measuring network quality for network providers. Specifically, we aim to provide the quality measurement of a base station or a cell. From the perspective of users, the network quality may vary within the range of a base station or a cell. However, network providers need a single indicator to represent the network quality of a base station or a cell. Once such indicator results are available, network providers can use them to answer important questions in planning and scheduling: Which area should have more base stations? How should the choice of cells for users be decided and optimized across the entire network to balance loads on different cells? How can providers plan in advance, based on historical data of network quality, to avoid bad network quality and heavy loads?

The traditional way used in practice to assess network quality is "drive testing" [1], in which technicians drive a vehicle carrying the equipment to measure various parameters and derive the quality of networks in a particular area. This approach may consume extensive resources and human labor. Since it takes time and effort to drive around, this approach cannot cover a large area at the same time or adapt to changes of network quality in real time.

To overcome these limitations, we propose a novel crowdsourcing approach to derive accurate and reliable estimates of network quality for a given area by aggregating information from a crowd of users. Specifically, network providers can easily collect network quality measurements from each user [2], [3]. For a given area, we can aggregate measurements obtained from all the users within this area and output a final estimate of the area network quality. Compared with the traditional drive testing approach, our proposed crowdsourcing approach has the following advantages: (1) The cost of collecting measurement data from users is much smaller than that incurred in drive testing. (2) As users span all geographical areas, the proposed approach can easily cover all the areas simultaneously in network quality estimation. (3) By controlling the frequency of information collection, we can provide real-time network quality estimates and monitor network quality changes in a timely manner. (4) Besides providing network quality estimates of an area, the proposed approach can also output user usage patterns that cannot be obtained by drive testing. Such patterns are invaluable for network providers for future planning and quality control.

The major technical challenge in the crowdsourcing approach is how to conduct effective aggregation among diverse quality measurements collected from a large number of users. The naive way to aggregate the network quality information from users is to conduct voting, in which the measurement value that is claimed by the majority of users is selected as output. The drawback of this simple aggregation is that it treats all the user measurement information equally. This may not be effective in practice, as the collected information can be affected by various user-related factors. For example, the following user-related factors may lead to the degradation of network quality: bad hardware quality in some users' devices, frequent transitions between cells caused by users' highly mobile moving patterns, and personal habits such as constantly refreshing websites or apps that result in a high volume of web requests. All of these user-related factors affect the user experience and the information collected from these users. Thus, when aggregating measurement information from various users, it is important to treat different users differently by considering the effect of these user-related factors.

Therefore, instead of conducting voting without distinguishing users, we should adopt a weighted aggregation scheme which takes user-related factors into consideration when aggregating network quality measurements collected from different users. Here the user weight indicates how much information from this particular user should be taken into consideration for the aggregation after removing user-related factors. If a user has a big bias in his measurements (e.g., his measurements are heavily affected by user-related factors), the corresponding weight will be low, while a high weight indicates that the corresponding user has a small bias in his measurements. Unfortunately, user-related factors are not easy to quantify, and thus user weights are not known a priori.

To tackle this challenge, we propose an effective way to estimate user weights and aggregate network quality simultaneously based on the following principle: Information from a user with a high weight should be counted more in the aggregation to derive the aggregated quality of the network, and a user whose provided information is close to the aggregated result should have a high weight. An optimization function is defined based on this principle to aggregate user measurements for better estimation of network quality. In this optimization framework, user weights and aggregated measurements are defined as two sets of unknown variables, and they are jointly estimated to optimize the objective function. By iteratively updating one set of variables based on the other set, the estimates of user weights and aggregated measurements are mutually enhanced until convergence is reached.

To evaluate the proposed crowdsourcing aggregation approach for assessing cellular network quality, we collect data from two citywide 3G networks, which involve a total of 616,796 users and 22,715 cells. Data are collected over an eight-day or four-day period from measurement report log files, IP packet traces, performance call history and record log files. From these files, we extract a collection of network quality measurements from individual users about the cells they are using. We experimentally examine the variety of user weights on these two datasets to verify the benefit of the proposed weighted aggregation scheme. The effectiveness of weighted aggregation via user weight estimation is demonstrated by comparing with the majority voting baseline on simulations and case studies. Then we derive network quality using the proposed aggregation scheme on these two datasets, and illustrate various patterns in network quality based on the aggregation results. These patterns are further explained by correlating them with network usage and traffic behavior. Moreover, we show that the proposed aggregation approach is efficient and linear in running time, and thus can be deployed in large-scale networks for real-time scenarios.

To sum up, in this paper, we deliver an effective solution to the network quality measurement task. We provide an end-to-end solution to network providers by extracting measurements from users' data and aggregating their measurements. More specifically, the contributions of this paper are:

• We present a novel, effective and cost-efficient crowdsourcing approach to measure network quality as an alternative to the drive testing approach. By wisely utilizing crowd information, the proposed approach can provide timely and accurate estimates of network quality, which is important for many planning and scheduling tasks of network providers.

• Instead of applying simple voting, we propose to integrate user measurements by a weighted combination scheme in which weights capture user-related factors. We propose an optimization method to jointly estimate user weights and aggregated network quality. By taking into account user weights, the aggregated measurements are more precise.

• We conduct thorough experimental analysis on two real-world datasets involving 616,796 users and 22,715 cells, which were collected recently from two citywide 3G operation networks. The results demonstrate the effectiveness of the proposed weighted aggregation scheme.

• We analyze the time complexity of the proposed method, and experimentally confirm that it runs in linear time with respect to the number of observations. We also implement a parallel version of the proposed method based on Hadoop, which makes the proposed method suitable for large-scale real-time quality monitoring.

2 METHODOLOGY

In this section, we first formally define the task of aggregating user measurements for network quality estimation. A majority voting baseline method is then described. After that, we present the proposed aggregation method, which conducts weighted voting via user weight estimation. We formulate the task as an optimization problem by considering the effect of user-related factors. Finally, we discuss some practical issues to adapt the proposed model to different task scenarios and platform settings.

2.1 Problem Formulation

To formally define the problem, we start by introducing some important terms:

• object: An object is the target whose network quality we want to measure. It could be a radio network controller, a base station, or a cell, each of which can cover a set of users. In this task, we focus on cells. As network quality and the users in a cell both change over time, we are interested in monitoring the status of network quality for an object, which is defined as the network quality of an object at a certain time. For example, "a cell with ID 37601 at 19:28pm, 21/07/2013" denotes a status whose network quality is to be evaluated. We represent a status using the variable $o$, which corresponds to an object at a certain time. We denote a set of statuses as $O = \{o_1, o_2, \ldots, o_N\}$, which contains a set of objects monitored at different time stamps.

• user: A user, denoted as $u$, is an end user who is supported by the network. Each status $o$ has a group of users, denoted by $U_o$, which is the set of users who are using the object at a particular time. We define the set of users for all statuses as $U = \bigcup_{i=1}^{N} U_{o_i}$.

• claim: A user $u$ makes a claim about a status $o$ when the user is supported by the object's network at a particular time. We collect the measurement information about this network's quality from this user. The measurement value is denoted as $v_o^u$. As required by the network provider, in the following we assume that the network quality measurements are discrete, that is, $v_o^u$ represents a value chosen from a predefined set $S = \{\text{"good"}, \text{"medium"}, \text{"bad"}\}$. We will discuss more details about this in Section 3.

• value: For a status $o$, each user $u$ in the set $U_o$ provides a measurement value $v_o^u$ to indicate network quality, and we combine the information from all the users in $U_o$ to get an aggregated value $v_o^*$ as the final output. In other words, $v_o^*$ is the desired measurement of the network quality at status $o$.

• weight: Each user $u \in U$ has a weight $t_u$, which indicates how much of the information from this particular user should be taken into consideration for aggregation. A low user weight suggests that the information from this user is highly affected by user-related factors and we should not rely on the information from this user when we calculate the aggregated value, while the information from a user with a higher weight will be counted more in the aggregation as this user has a small bias (e.g., user-related factors) in his measurement.

To sum up, in this task, suppose there are $N$ statuses to be monitored, each of which has a corresponding set of users $U_o$. From each user $u$ in the set, a network quality measurement value $v_o^u$ is collected for the status $o$. The objective is to integrate the measurement values $\{v_o^u\}_{u \in U_o}$ among users, and get an aggregated measurement $v_o^*$ for each status $o$.
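As an illustration of these definitions, the claims can be held in a nested mapping from status to user to value. This is a minimal sketch, not the authors' data format: the record layout, the status labels, and the helper name `group_by_status` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical claim records: (status o, user u, claimed value v_o^u).
# A status is an object (here a cell) observed at one time stamp.
raw_claims = [
    ("cell37601@19:28", "u1", "good"),
    ("cell37601@19:28", "u2", "bad"),
    ("cell37601@19:29", "u1", "good"),
]


def group_by_status(records):
    """Return {status o: {user u: value v_o^u}}; the keys of each inner dict form U_o."""
    claims = defaultdict(dict)
    for status, user, value in records:
        claims[status][user] = value
    return dict(claims)


claims = group_by_status(raw_claims)
# U is the union of the per-status user sets U_o.
all_users = set().union(*(votes.keys() for votes in claims.values()))
```

With this layout, the number of claims $C$ is simply the total number of inner entries, which may be far smaller than $N \cdot |U|$ when most users touch only a few statuses.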

2.2 Baseline Method

A naive solution for this aggregation task is to find the value that is claimed by the majority of the users. That is, for each status $o$, the value $v \in S$ that has the highest count among users is selected as the final output $v_o^*$:

$$v_o^* = \arg\max_{v \in S} \sum_{u \in U_o} \mathbb{1}(v_o^u = v). \tag{1}$$

Although this approach provides an efficient way to aggregate user measurement values, it assumes that information from different users is equally important. It does not consider the effect of user-related factors and does not distinguish among users. However, as discussed in Section 1 and justified in Section 5.1, the quality measurements are affected by user-related factors and users should be distinguished.
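The baseline of Eq. (1) can be sketched in a few lines. The function name `majority_vote` and the tie-breaking behavior (the value seen first wins) are assumptions of this sketch; the text does not specify how ties are resolved.

```python
from collections import Counter


def majority_vote(values):
    """Return the most frequently claimed value (Eq. 1).

    Ties are broken in favor of the value encountered first, which is
    an assumption of this sketch rather than part of Eq. (1).
    """
    return Counter(values).most_common(1)[0][0]


print(majority_vote(["good", "bad", "good", "medium"]))  # good
```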

2.3 Weighted Aggregation

It is more reasonable to conduct weighted aggregation of network quality measurements by considering the effect of user-related factors. Suppose the weight of each user, denoted as $t_u$, is available. We can incorporate such weights into the aggregation using weighted combination. The aggregated measurement $v_o^*$ for each status $o$ is calculated as:

$$v_o^* = \arg\max_{v \in S} \sum_{u \in U_o} t_u \cdot \mathbb{1}(v_o^u = v). \tag{2}$$

In this weighted voting scheme, we place more trust in the information $v_o^u$ that is claimed by the users who have higher weights. In Section 5.2, our experimental results show that the weighted voting scheme outputs more reasonable measurement values compared with simple majority voting.
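To make the contrast with majority voting concrete, here is a small sketch of Eq. (2) for a single status. The weights are made-up numbers chosen for illustration, and `weighted_vote` is a hypothetical helper name.

```python
from collections import defaultdict


def weighted_vote(votes, weights):
    """Return the value with the largest total user weight (Eq. 2).

    votes:   {user: claimed value} for one status
    weights: {user: weight t_u}
    """
    totals = defaultdict(float)
    for user, value in votes.items():
        totals[value] += weights[user]
    return max(totals, key=totals.get)


votes = {"u1": "good", "u2": "good", "u3": "bad"}
weights = {"u1": 0.2, "u2": 0.3, "u3": 2.0}  # illustrative values only
print(weighted_vote(votes, weights))  # bad
```

Two of the three users claim "good", so majority voting would output "good"; the weighted vote outputs "bad" because the single high-weight user outweighs the other two (2.0 versus 0.5).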

As weighted aggregation is the preferred aggregation scheme, the key is to derive the user weights to be incorporated in the aggregation. However, user-related factors are typically unknown, and when the aggregated network quality is also unknown, the user weights cannot be easily estimated. To tackle this challenge, we propose a novel approach to estimate both user weights and network quality jointly. The basic principle is inspired by our previous work [4]: Information from a user with a high weight should be counted more in the aggregation, and a user whose provided information is close to the aggregated result should have a high weight. Based on this principle, we design a method that tightly integrates the process of network quality aggregation and user weight estimation. Specifically, we formulate the task as the following optimization problem:

$$\min_{\{v_o^*\},\{t_u\}} \sum_{o \in O} \sum_{u \in U_o} t_u \cdot \operatorname{dis}(v_o^u, v_o^*) \quad \text{s.t.} \quad t_u \ge 0, \; h(\{t_u\}) = 1, \tag{3}$$

where $\operatorname{dis}(\cdot)$ is a distance function that measures the difference between two measurement values, and the 0-1 loss function is chosen here: If $v_o^u$ is the same as $v_o^*$, the distance is 0; otherwise, the distance is 1. And $h(\cdot)$ is a constraint function to regularize the users' weights. As users have various weights, we set the constraint function to be:

$$h(\{t_u\}) = \sum_{u \in U} \exp(-t_u) = 1. \tag{4}$$

We choose this constraint function in order to derive a meaningful user weight distribution, which is discussed later in this section.

The intuition behind this optimization formulation is as follows. Since users with high weights provide more accurate values of network quality measurement, the aggregated measurement value should be closer to the claimed values from these users. If the aggregated measurement $v_o^*$ is far from the claimed values $v_o^u$ from high-weight users, the loss will be high because the distance between $v_o^*$ and $v_o^u$ is weighted by the user weight $t_u$. Meanwhile, we allow the aggregated measurement $v_o^*$ to be different from the claimed values $v_o^u$ from users with a low weight $t_u$. By minimizing the overall distance, the aggregated measurement will rely on the users with high weights and their provided information.

From Eq. (3), we can see that there are two sets of unknown variables, i.e., $\{v_o^*\}$ and $\{t_u\}$. Therefore, a natural way to solve this optimization problem is to use block coordinate descent techniques [5]. Note that although the block coordinate descent technique is widely adopted for optimization problems, our formulation (i.e., objective function and constraints) of the practical network quality measurement task is unique. Specifically, we propose a two-step procedure, which iteratively updates one set of variables to minimize the objective function while keeping the other set of variables unchanged. Here are the two steps:

• Aggregation Step: At this step, we fix the weight for each user and assume that $\{t_u\}$ are known. With the known user weights, we aggregate the measurement values for each status as follows:

$$v_o^* \leftarrow \arg\min_{v \in S} \sum_{u \in U_o} t_u \cdot \operatorname{dis}(v_o^u, v). \tag{5}$$

This equation is equivalent to Eq. (2). Once we calculate the aggregated measurement for each object as $\{v_o^*\}$, the overall objective function is minimized with $\{t_u\}$ fixed.

• Estimation Step: At this step, we fix the aggregated value for each status and update the weights for users by minimizing the following function:

$$\{t_u\} \leftarrow \arg\min_{\{t_u\}} \sum_{o \in O} \sum_{u \in U_o} t_u \cdot \operatorname{dis}(v_o^u, v_o^*) \quad \text{s.t.} \quad \sum_{u \in U} \exp(-t_u) = 1. \tag{6}$$

The closed-form solution for Eq. (6) is:

$$t_u = -\log\left(\frac{\sum_{o: u \in U_o} \operatorname{dis}(v_o^u, v_o^*)}{\sum_{u' \in U} \sum_{o: u' \in U_o} \operatorname{dis}(v_o^{u'}, v_o^*)}\right). \tag{7}$$

According to this update equation, the weight of a particular user is the negative logarithm of the normalized distance, which is measured as the distance of the user's measurements to the aggregated measurements divided by the total distance among all users. This matches our intuition that a user whose provided measurement values are closer to the aggregated values should be assigned a higher weight. The logarithm is used to re-scale the user weights to a reasonable range.
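As a small worked instance of Eq. (7), consider a hypothetical case with only two users, assuming the natural logarithm: user $A$ disagrees with the aggregated value on 1 status and user $B$ disagrees on 3 statuses, so the total distance is 4.

```latex
\begin{align*}
t_A &= -\log\frac{1}{4} = \log 4 \approx 1.386, \\
t_B &= -\log\frac{3}{4} = \log\frac{4}{3} \approx 0.288, \\
\exp(-t_A) + \exp(-t_B) &= \frac{1}{4} + \frac{3}{4} = 1.
\end{align*}
```

The user with fewer disagreements receives the larger weight, and the constraint of Eq. (4) holds by construction. Note that a user with zero total distance would receive an unbounded weight under Eq. (7); how that case is handled is not stated in this section.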

The pseudocode of this procedure is summarized in Algorithm 1. We initialize user weights uniformly, and then iteratively conduct the aggregation step and the estimation step until convergence. A detailed proof of convergence is given in Section 2.5. In practice, we implement the convergence criterion by judging whether the decrease in the objective function is small enough compared with the previous iterations. In the experiments, we find that the convergence of this approach is easy to judge because the first several iterations incur a large decrease in the objective function, and once it converges, the results become stable.

Algorithm 1

Input: Claimed values from users for statuses $\{v_o^u \mid u \in U_o\}_{o=1}^{N}$
Output: Aggregated values $\{v_o^*\}_{o=1}^{N}$ and user weights $\{t_u\}_{u \in U}$

Initialize user weights $\{t_u = 1\}_{u \in U}$;
repeat
    for $o \leftarrow 1$ to $N$ do
        Aggregation step: calculate the aggregated measurement $v_o^*$ for status $o$ according to Eq. (5) based on the current estimation of user weights;
    end for
    for each $u \in U$ do
        Estimation step: update the user weights $\{t_u\}$ according to Eq. (7) based on the current aggregated measurements;
    end for
until the convergence criterion is satisfied
return $\{v_o^*\}_{o=1}^{N}$ and $\{t_u\}_{u \in U}$
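The following Python sketch shows one way Algorithm 1 could be realized; it is an illustration under stated assumptions, not the authors' implementation. The function names, the nested-dict input layout, the tolerance `tol`, the iteration cap `max_iter`, and the toy data are all hypothetical. In particular, the smoothing term `eps` is an addition of this sketch: it keeps the weight finite for a user whose total distance is zero, a case in which Eq. (7) taken literally is unbounded.

```python
import math
from collections import defaultdict


def weighted_vote(votes, weights):
    """Aggregation step (Eq. 5 with 0-1 loss): value with the largest weighted support."""
    totals = defaultdict(float)
    for user, value in votes.items():
        totals[value] += weights[user]
    return max(totals, key=totals.get)  # ties go to the first value seen (assumption)


def estimate_weights(claims, aggregated, eps=1e-9):
    """Estimation step (Eq. 7); eps is an assumed smoothing term, not part of Eq. (7)."""
    losses = defaultdict(float)
    for status, votes in claims.items():
        for user, value in votes.items():
            losses[user] += 0.0 if value == aggregated[status] else 1.0
    total = sum(losses.values()) + eps * len(losses)
    return {u: -math.log((loss + eps) / total) for u, loss in losses.items()}


def objective(claims, aggregated, weights):
    """Weighted 0-1 loss, i.e., the objective of Eq. (3)."""
    return sum(weights[u] * (v != aggregated[o])
               for o, votes in claims.items() for u, v in votes.items())


def aggregate(claims, max_iter=100, tol=1e-6):
    """Alternate the two steps until the objective stops changing."""
    users = {u for votes in claims.values() for u in votes}
    weights = {u: 1.0 for u in users}  # uniform initialization
    previous = math.inf
    for _ in range(max_iter):
        aggregated = {o: weighted_vote(votes, weights) for o, votes in claims.items()}
        weights = estimate_weights(claims, aggregated)
        current = objective(claims, aggregated, weights)
        if abs(previous - current) < tol:
            break
        previous = current
    return aggregated, weights


# Toy data (made up): u1 and u2 mostly agree, u3 and u4 are noisy.
claims = {
    "o1": {"u1": "good", "u2": "good", "u3": "bad"},
    "o2": {"u1": "bad", "u2": "bad", "u4": "good"},
    "o3": {"u1": "medium", "u2": "medium", "u3": "good", "u4": "bad"},
    "o4": {"u1": "good", "u3": "bad", "u4": "bad"},
    "o5": {"u1": "good", "u2": "good", "u3": "medium", "u4": "bad"},
}
aggregated, weights = aggregate(claims)
print(aggregated["o4"])  # good
```

On this toy input, plain majority voting would label `o4` as "bad" (two votes against one). After the first weight update, `u1` outweighs `u3` and `u4` combined, so the second aggregation step flips `o4` to "good" and the procedure then stabilizes.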

2.4 Time Complexity Analysis

Here we analyze the time complexity of the proposed method. Assume there are in total $C$ claims about $N$ objects from $|U|$ users. Note that as some users may not provide observations about some objects, $C \le N \cdot |U|$. At the aggregation step, weighted aggregation is conducted for each object among its corresponding claims, and thus we need $O(C)$ time to compute the aggregated measurements for all the objects. At the estimation step, we need to scan all the claims to calculate the error between the claimed value and the aggregated value, so the estimation step also takes $O(C)$ time. Thus, for each iteration, the time complexity is $O(C)$. In the next section, we prove the convergence of the proposed method; since the objective function stabilizes after the first several iterations in practice, the number of iterations can be regarded as a constant. Therefore, the time complexity of the proposed method is $O(C)$, i.e., the running time of the proposed method is linear with respect to the number of observations. In Section 5.4, we will experimentally confirm this analysis.

2.5 Convergence Analysis

Now we prove the convergence of the proposed weighted voting approach by the following theorem.

Theorem 1. When the 0-1 loss function is used as the loss function and Eq. (4) is used as the constraint, the convergence of the proposed weighted voting approach is guaranteed.

Proof. According to the proposition on the convergence of block coordinate descent [5], if a unique minimum is achieved by optimizing the objective function with respect to one set of variables during each iteration, the iterative procedure leads to a stationary point. Therefore, to prove the convergence of the proposed approach for Eq. (3), we only need to show that a unique minimum is achieved with respect to the user weights or the aggregated values during the aggregation or estimation step.

Let us first look at the aggregation step. At this step, the weights are fixed, and thus Eq. (3) is a weighted combination of all the distances between each input and the aggregated value. Clearly, $v_o^*$ should be the value that receives the highest weighted votes among all possible values:

$$v_o^* \leftarrow \arg\max_{v \in S} \sum_{u \in U_o} t_u \cdot \mathbb{1}(v_o^u, v), \tag{8}$$

where $\mathbb{1}(x, y) = 1$ if $x = y$, and 0 otherwise. If we choose any other value for $v_o^*$, the objective function incurs a larger value.

At the estimation step, the aggregated values are fixed, and we will prove that the following solution for the user weights gives the unique minimum of the optimization problem Eq. (3) with constraint Eq. (4):

$$t_u = -\log\left(\frac{\sum_{o: u \in U_o} \operatorname{dis}(v_o^u, v_o^*)}{\sum_{u' \in U} \sum_{o: u' \in U_o} \operatorname{dis}(v_o^{u'}, v_o^*)}\right). \tag{9}$$

Since the aggregated values are fixed, the optimization problem Eq. (3) has only one set of variables $\{t_u\}$. We first prove that Eq. (3) is convex. We introduce another variable $a_u$ so that $a_u = \exp(-t_u)$. Now we express the optimization problem in terms of $a_u$:

$$\min_{\{a_u\}} \sum_{o \in O} \sum_{u \in U_o} -\log(a_u) \cdot \operatorname{dis}(v_o^u, v_o^*) \quad \text{s.t.} \quad \sum_{u \in U} a_u = 1. \tag{10}$$

The constraint in Eq. (10) is linear in $a_u$, and hence affine. The objective function is a nonnegative linear combination of negative logarithm functions and thus it is convex. Therefore, the optimization problem Eq. (3) with Eq. (4) is convex, and any local optimum is also a global optimum [6].

We use the method of Lagrange multipliers to solve this optimization problem. The Lagrangian of Eq. (10) is given as:

$$L(\{a_u\}, \lambda) = \sum_{o \in O} \sum_{u \in U_o} -\log(a_u) \cdot \operatorname{dis}(v_o^u, v_o^*) + \lambda \left(\sum_{u \in U} a_u - 1\right), \tag{11}$$

where $\lambda$ is a Lagrange multiplier. Setting the partial derivative of the Lagrangian with respect to $a_u$ to 0, we get:

$$\sum_{o: u \in U_o} \operatorname{dis}(v_o^u, v_o^*) = \lambda \cdot a_u. \tag{12}$$

From the constraint that $\sum_{u \in U} a_u = 1$, we can derive that

$$\lambda = \sum_{u' \in U} \sum_{o: u' \in U_o} \operatorname{dis}(v_o^{u'}, v_o^*). \tag{13}$$

Plugging Eq. (13) and $t_u = -\log(a_u)$ into Eq. (12), we obtain Eq. (9). □
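For readers who want the intermediate step between Eq. (11) and Eq. (12) spelled out, it can be written as follows. This is a restatement of the derivation above using the same symbols, assuming $a_u > 0$.

```latex
\begin{align*}
\frac{\partial L}{\partial a_u}
  &= -\frac{1}{a_u} \sum_{o: u \in U_o} \operatorname{dis}(v_o^u, v_o^*) + \lambda = 0
  \;\Longrightarrow\;
  a_u = \frac{1}{\lambda} \sum_{o: u \in U_o} \operatorname{dis}(v_o^u, v_o^*), \\
\sum_{u \in U} a_u = 1
  &\;\Longrightarrow\;
  \lambda = \sum_{u' \in U} \sum_{o: u' \in U_o} \operatorname{dis}(v_o^{u'}, v_o^*), \\
t_u = -\log(a_u)
  &= -\log\left(
     \frac{\sum_{o: u \in U_o} \operatorname{dis}(v_o^u, v_o^*)}
          {\sum_{u' \in U} \sum_{o: u' \in U_o} \operatorname{dis}(v_o^{u'}, v_o^*)}
     \right).
\end{align*}
```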

2.6 Practical Issues

The proposed aggregation algorithm can be used to derive an aggregated measurement value for network quality, and the presentation of the algorithm so far is based on a specific scenario. In fact, it can be applied or easily extended to fit various scenarios and take advantage of different computing platforms. Here, we discuss some practical issues that arise when extending the approach to these scenarios.

• Data format: In the previous discussion, we assume that the measurement value from each individual user is a single value chosen from the predefined set $S = \{$“good”, “medium”, “bad”$\}$. In fact, the proposed framework is not restricted to this particular set; the method works on any predefined set of discrete values. If the network provider would like to see more fine-grained predefined values, the proposed method can be easily adapted to handle the new set.

• User weight assignment: The constraint function determines how weights are distributed among users. Depending on the requirements of the application task, other strategies can be adopted to set the constraint function. For example, if user selection is desirable, we can force $t_u \in \{0, 1\}$ and use the corresponding constraint $h(\{t_u\}) = \frac{1}{j} \sum_{u \in U} t_u = 1$. This strategy selects $j$ users and eliminates the other users’ information from the aggregation process.

• User weight initialization: In Algorithm 1, we initialize the user weights with uniform values. However, if we have any prior or external knowledge about users, such information can be incorporated into the setting of the initial user weights.

• Online version: In the previous discussion, we present our proposed method in an offline setting, but it can be easily modified for an online or incremental setting as described below. Suppose we have data continuously flowing in, i.e., data streams, and they arrive in sequential chunks. At each time stamp, based on the current estimated weights, we calculate the aggregated measurements for the newly emerging statuses. Based on these aggregated measurements, we calculate each user’s loss on these new statuses, and then we incrementally update $t_u$ according to the following equation:

$$t_u = -\log\left(\frac{b_u + \sum_{o: u \in U_o} \mathrm{dis}(v_o^u, v_o^*)}{\sum_{u' \in U} \left(b_{u'} + \sum_{o: u' \in U_o} \mathrm{dis}(v_o^{u'}, v_o^*)\right)}\right), \tag{14}$$

where $b_u$ records the total loss of user $u$ from the beginning up to the current time stamp, and it is updated each time to include the new loss.

• Parallel platform: The number of involved users and statuses could be huge, and network providers usually have powerful parallel computation platforms. Our proposed method can take advantage of this to speed up the process. Here, we briefly discuss how to fit our method into parallel computing platforms such as MapReduce. The aggregation step can be executed independently for each status, so it is straightforward to parallelize. The estimation step is in summation form, and according to [7], it can be parallelized by aggregating partial sums. Overall, the proposed method is easy to parallelize.
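As an illustration of the online update in Eq. (14), the following Python sketch keeps a running loss $b_u$ per user and recomputes the weights after each chunk. The class name `OnlineWeights`, the per-chunk loss dictionaries, and the smoothing constant `eps` (which keeps the logarithm finite for users with zero accumulated loss) are assumptions of this sketch rather than details given in the text.

```python
import math

class OnlineWeights:
    """Incremental user weights in the spirit of Eq. (14)."""

    def __init__(self, users, eps=1e-3):
        # b[u] accumulates the loss of user u over all chunks seen so far.
        self.b = {u: 0.0 for u in users}
        # eps is an assumed smoothing value, not part of Eq. (14).
        self.eps = eps

    def update(self, chunk_loss):
        """Add each user's loss on the new chunk, then recompute all weights."""
        for user, loss in chunk_loss.items():
            self.b[user] += loss
        smoothed = {u: max(v, self.eps) for u, v in self.b.items()}
        total = sum(smoothed.values())
        return {u: -math.log(v / total) for u, v in smoothed.items()}

tracker = OnlineWeights(["u1", "u2", "u3"])
first = tracker.update({"u1": 2, "u2": 2, "u3": 3})    # hypothetical chunk 1
second = tracker.update({"u1": 0, "u2": 1, "u3": 2})   # hypothetical chunk 2
print(first)
print(second)
```

Because only the accumulated losses are stored, the memory cost of this sketch is one number per user regardless of how many chunks have been processed.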

3 DATA COLLECTION

In the previous section, we presented our proposed method and discussed several practical issues that arise when fitting it to various application scenarios. In the following sections, we describe the procedure of collecting and extracting user measurements of network quality as the input to the proposed aggregation method.

We collect three types of raw data, namely Measurement Report (MR) log files, IP packet traces, and Performance Call History and Record (PCHR) log files, from a real Universal Mobile Telecommunication System (UMTS) network, which is one of the most popular 3G mobile cellular networks deployed nowadays. Meanwhile, we maintain an IMEI (International Mobile station Equipment Identity) database, which helps us establish the mapping from the type allocation code of a device to the corresponding device type, including the hardware model and operating system. All these datasets are collected during the same period and present different features of the network. The network quality information from smartphone users is derived from these log files, which are automatically collected by the network provider from users when they are using the network. During this process, the users are involved passively, and they do not need to make any extra effort to contribute their measurement data.

MR logs periodically record the user-level network signal quality in terms of Received Signal Code Power (RSCP) and Ec/No, which are both standard measurements in 3G networks. Generally, RSCP and Ec/No represent the signal power and signal intensity, respectively, received by the user in the network at that time. As signal quality essentially indicates mobile cellular network quality, a user-level network quality measurement can be obtained by combining these two values. According to the requirement from the network provider, we map the raw values to the discrete values {“good”, “medium”, “bad”} to represent network quality, as this kind of representation is easy to understand and can be further adopted for planning and scheduling. To be more specific, if a user’s RSCP is smaller than the threshold of -95 dBm, it indicates “bad” network quality; otherwise, it indicates “good” network quality. For Ec/No, the same strategy is used with a threshold set at -10 dB. Note that these thresholds are determined by domain experience and common practice from cellular operators. If both RSCP and Ec/No indicate “good” network quality, the combined measurement is “good”; if both of them are “bad”, the combined one is “bad”; otherwise, it is “medium”. By doing this, for a cell at a certain time, we map each user’s RSCP and Ec/No information into “good”, “medium”, or “bad” as the measurement of this user about the corresponding cell he is using. Note that this mapping from raw values to discrete values is decided by the granularity that the network provider wants to achieve, and the proposed method can work with any predefined discrete values.

A small sample of user-level information and the corresponding network quality measurements is illustrated in Table 1. Besides the network quality measurements, MR logs provide the location information of each user in every record. In other words, from MR logs we can get the network quality information from all the users located at certain places with respect to a cell. We implement the proposed aggregation method on these datasets to integrate user-level network quality measurements and derive the network quality of each cell.
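The threshold rule described above can be sketched in a few lines of Python. The function and constant names are illustrative choices; the thresholds (-95 dBm for RSCP, -10 dB for Ec/No) and the combination rule come from the text, and a value exactly at a threshold is treated as “good” because only values smaller than the threshold are described as “bad”.

```python
RSCP_THRESHOLD_DBM = -95.0   # RSCP below this value indicates "bad"
ECNO_THRESHOLD_DB = -10.0    # Ec/No below this value indicates "bad"

def quality_label(rscp_dbm, ecno_db):
    """Combine the two per-signal verdicts into good / medium / bad."""
    rscp_good = rscp_dbm >= RSCP_THRESHOLD_DBM
    ecno_good = ecno_db >= ECNO_THRESHOLD_DB
    if rscp_good and ecno_good:
        return "good"
    if not rscp_good and not ecno_good:
        return "bad"
    return "medium"

# The six sample records listed in Table 1: (user, object, RSCP, Ec/No).
sample = [
    ("u1", "o1", -99.0, -14.0),
    ("u2", "o1", -93.0, -12.5),
    ("u3", "o1", -98.0, -8.5),
    ("u1", "o2", -103.0, -12.0),
    ("u2", "o2", -97.0, -9.0),
    ("u3", "o2", -83.0, -5.5),
]
labels = [quality_label(rscp, ecno) for _, _, rscp, ecno in sample]
print(labels)
```

Applied to the sample records, the rule yields the labels shown in the last column of Table 1.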

The IP packet traces, PCHR logs, and IMEI database provide information about the factors that are related to network quality. By correlating these three data sources, we can obtain information including cell-level traffic patterns, user device types, etc. Note that the information extracted here is only used during result validation. It is not an input to the aggregation scheme.

We conduct experiments on two real network datasets collected using the above procedure. The first one is collected in City A, from November 25, 2010 to December 2, 2010. After pre-processing, it contains 34,917,421 entries, which involve 135,155 users and 10,421 cells. The second one is collected in City B, from March 3, 2013 to March 6, 2013. This dataset contains 193,480,418 entries, in which 481,641 users and 12,294 cells are involved. Statistics of these two datasets are summarized in Table 2. These two recently collected large-scale datasets provide a good testbed for the 3G network quality measurement task.

4 RUNNING EXAMPLE

In this section, we demonstrate how the proposed methodology works using the example shown in Table 1. In this example, we assume that three users are available and they provide network quality measurements for five statuses.

TABLE 1
Data Sample: User-Level Network Quality Measurements are Obtained by Combining Users’ RSCP and Ec/No Information

User   Object   RSCP (dBm)   Ec/No (dB)   Network Quality
u1     o1       -99.0000     -14.0000     bad
u2     o1       -93.0000     -12.5000     medium
u3     o1       -98.0000     -8.5000      medium
u1     o2       -103.0000    -12.0000     bad
u2     o2       -97.0000     -9.0000      medium
u3     o2       -83.0000     -5.5000      good
...    ...      ...          ...          ...

TABLE 2
Statistics of the Collected Datasets

                    City A        City B
number of claims    34,917,421    193,480,418
number of users     135,155       481,641
number of cells     10,421        12,294


The inputs from the users are summarized in Table 3. The goal is to aggregate these user-input measurements to obtain more reasonable quality values for each status.

According to Algorithm 1, we start by assigning equal weights to all users, that is, we set $t_{u_1} = t_{u_2} = t_{u_3} = 1.0$, where $t_{u_i}$ is the weight of user $u_i$. Based on the initial user weights, we conduct the first round of the aggregation step, which calculates each status’s network quality by a weighted vote among the user inputs. Currently all the users have the same weight, so the results are basically obtained by majority voting. For example, for the first status $o_1$, the value “medium” has 2.0 votes while “bad” only has 1.0 vote, so “medium” is selected as the aggregated measurement. All the results are illustrated in Table 4.

After the aggregation step, we update the users’ weights based on the current aggregated measurements. According to Eq. (7), we calculate the difference between each user’s provided information and the aggregated measurements, and then re-scale it as the user’s new weight. In this running example, the updated user weights are $t_{u_1} = 1.25$, $t_{u_2} = 1.25$, and $t_{u_3} = 0.84$. As can be observed from Table 4, $u_3$’s input is usually different from the aggregated value, so it receives a low weight.

Each round consists of one aggregation step and one estimation step. Table 5 shows the result of the aggregation step during round 2. Similar to the first round, the users’ weights are re-estimated based on the new aggregated measurements, and they change to $t_{u_1} = 1.94$, $t_{u_2} = 1.25$, and $t_{u_3} = 0.55$.

At round 3, we continue to conduct the aggregation step and the estimation step. The result of the aggregation step is shown in Table 6, and the weights are updated to $t_{u_1} = 8.98$, $t_{u_2} = 0.98$, and $t_{u_3} = 0.47$. After this round, neither the aggregated measurements nor the user weights change any more. In other words, the algorithm reaches convergence, and thus Table 6 is the final output.
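The three rounds above can be traced with a short Python sketch. Two details are assumptions made here so that the sketch matches the reported tables, since the text does not state them: ties in the weighted vote are broken toward the worse label (order “bad”, “medium”, “good”), and a user with zero loss has its loss replaced by a small constant `EPS` before the logarithm is taken. The weight update follows the form of Eq. (9) with a 0-1 distance.

```python
import math

LABELS = ("bad", "medium", "good")   # assumed tie-break order
EPS = 1e-3                           # assumed smoothing for zero loss

# Input data of Table 3: one list of five labels per user.
claims = {
    "u1": ["bad", "bad", "good", "medium", "good"],
    "u2": ["medium", "medium", "good", "good", "good"],
    "u3": ["medium", "good", "medium", "bad", "medium"],
}

def aggregate(weights):
    """Aggregation step: weighted vote for every status."""
    result = []
    for o in range(5):
        score = {v: 0.0 for v in LABELS}
        for user, values in claims.items():
            score[values[o]] += weights[user]
        # max() keeps the first maximal label, which realizes the tie rule.
        result.append(max(LABELS, key=score.get))
    return result

def estimate(aggregated):
    """Estimation step: t_u = -log(loss_u / total loss) with 0-1 loss."""
    loss = {u: sum(v != a for v, a in zip(values, aggregated))
            for u, values in claims.items()}
    total = sum(loss.values())
    return {u: -math.log(max(l, EPS) / total) for u, l in loss.items()}

weights = {u: 1.0 for u in claims}   # uniform initialization
history = []
for round_no in range(1, 4):
    aggregated = aggregate(weights)
    weights = estimate(aggregated)
    history.append((aggregated, weights))
    print(round_no, aggregated, weights)
```

Under these assumptions the aggregated values of the three rounds coincide with Tables 4, 5, and 6, and the computed weights agree with the reported ones up to the last printed digit.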

5 EXPERIMENTS

In this section, we report the experimental results that validate the proposed crowdsourcing aggregation approach for network quality measurement. In practice, for the network quality measurement task, it is very difficult to identify a clearly defined quality measurement that can be adopted as the ground truth. This motivates our work but makes evaluation difficult. To tackle this challenge, we evaluate the performance of the proposed method from the following perspectives: (1) We first validate our assumption that the measurement information from users is affected by user-related factors and thus user weight assignment is needed. (2) The effectiveness of the weighted aggregation approach is justified by comparing it with the majority voting baseline on simulations and case studies. (3) We then evaluate the network quality output by the proposed aggregation scheme by showing the quality evolutionary patterns with respect to different factors, such as time, data plan, etc. (4) The proposed approach is shown to be efficient on large-scale datasets and to have linear running time.

5.1 User Weight Assignment

The proposed weighted aggregation approach is based on the assumption that the measurement information collected from different users has different weights due to the variance in user device, user mobility, user habit, etc. As network quality information collected from different users should not be treated equally, it is essential to estimate user weights and conduct weighted aggregation. In this part, we justify this assumption through experimental results on our data collection of 3G networks that is described in Section 3.

As we discussed, there might be multiple user-related factors that contribute to the user weight assignment. Here we choose one factor, the user device, as an example to illustrate how it may affect the network quality measurements. Terminal devices used by end users come from different manufacturers. These devices typically have various hardware and software systems, which lead to differences in the measured signals. As some devices may present more accurate measurement values, the users who use such devices are regarded as providing better measurement information and receive high weights. Therefore, we group users based on the type of devices they are using, and we expect variability in user weights among different user groups according to device types.

TABLE 3
Running Example: Input Data

        o1       o2       o3       o4       o5
u1      bad      bad      good     medium   good
u2      medium   medium   good     good     good
u3      medium   good     medium   bad      medium

TABLE 4
Running Example: Aggregation Step in Round 1

        o1       o2       o3       o4       o5
u1      bad      bad      good     medium   good
u2      medium   medium   good     good     good
u3      medium   good     medium   bad      medium
v_o^*   medium   bad      good     bad      good

TABLE 5
Running Example: Aggregation Step in Round 2

        o1       o2       o3       o4       o5
u1      bad      bad      good     medium   good
u2      medium   medium   good     good     good
u3      medium   good     medium   bad      medium
v_o^*   medium   bad      good     medium   good

TABLE 6
Running Example: Aggregation Step in Round 3

        o1       o2       o3       o4       o5
u1      bad      bad      good     medium   good
u2      medium   medium   good     good     good
u3      medium   good     medium   bad      medium
v_o^*   bad      bad      good     medium   good

From our data collection, we derive user groups based on device types as follows. We identify the type of device for each user, and find more than 500 types among all the users. There exist some missing entries, so in total we find device type information for 120,742 out of 127,258 users, which accounts for 95 percent. In order to visually inspect the difference in weights among user groups, we map the 500 specific device types into 12 general types according to their brands.

We then partition the whole user set into groups based on their general device types, and check the percentage of users with high weights within each group. We set a threshold (16 in this experiment) and regard the users whose weight is larger than this threshold as high-weight users. Table 7 shows the percentage of high-weight users for each device type group. From this table, we can observe the variance across different device types, which confirms the effect of user device.

To further demonstrate the difference in user weights among device type groups, we quantitatively show the pairwise difference by t-test. Specifically, we perform a t-test on the weight distributions of each pair of device type groups, which outputs whether the two distributions are significantly different or not. The results of the pairwise t-tests at the 5 percent significance level are summarized in Table 8. Due to the space limit, only the five most popular device types are shown here. The pairwise difference can be clearly observed among most of the devices. Both Tables 7 and 8 show the variation in user weights caused by device types. Other factors may also lead to differences in the ability of users to provide measurement values. Therefore, it is critical to take user weights into account in the aggregation of measurement values to reduce the effect of user-related factors.

5.2 Quality Evaluation

In this section, we validate the effectiveness of our proposed weighted aggregation method compared with majority voting. We first evaluate both methods on a synthetic dataset in which the “ground truth” information is available, and then show the comparison on a case study from the real datasets.

5.2.1 Simulated Experiments

We randomly select 100 cells’ network quality during one day (100 trials in total), and these real trials implicitly cover various factors that can affect the network quality of cells. These real trials are regarded as the “ground truth” in evaluation. Then we simulate users with different weights. For each trial, we generate measurement values of 30 users, and each user is given a parameter $a$ ($a \in [0, 1]$). A low $a$ indicates that the corresponding user has a low weight, and his measurement values are more likely to be affected by user-related factors. On the other hand, a high $a$ indicates a high probability of the user being unbiased and providing reliable measurement values.

According to the requirement from the network provider, the network quality measurements are discrete values. Thus, majority voting, i.e., Eq. (1), is adopted as the natural baseline method. Due to the randomness of simulation, we repeat each experiment 10 times and then report the mean as the performance result.

TABLE 7
Percentage of High-Weight Users for Different Device Type Groups

Device type   Percentage   Device type   Percentage
BlackBerry    13.56%       LG            10.89%
HTC           5.37%        Motorola      7.12%
Huawei        19.79%       Nokia         10.81%
iPad          10.28%       Samsung       12.00%
iPhone        5.87%        Sony          13.15%
Lenovo        8.52%        ZTE           17.23%

TABLE 8
Pairwise t-Test on User Weights of Device Type Groups

          Nokia   iPhone   iPad   Samsung   HTC
Nokia     NA      1        0      0         1
iPhone            NA       1      1         1
iPad                       NA     0         1
Samsung                           NA        1
HTC                                         NA

1 denotes a positive t-test result while 0 denotes a negative one.

TABLE 9
Performance Comparison on Synthetic Dataset

          Error Rate                    Weight Distance
Trial     Proposed Method   Voting      Proposed Method   Voting
1         0.0825            0.1640      0.0455            0.1651
2         0.0649            0.1272      0.0469            0.1626
3         0.0614            0.1412      0.0486            0.1620
4         0.0833            0.1518      0.0444            0.1639
5         0.0789            0.1465      0.0453            0.1624
6         0.0965            0.1754      0.0444            0.1629
7         0.0851            0.1500      0.0454            0.1625
8         0.0868            0.1702      0.0486            0.1645
9         0.0789            0.1535      0.0484            0.1615
10        0.0667            0.1500      0.0446            0.1628
11        0.0877            0.1570      0.0476            0.1629
12        0.0675            0.1491      0.0460            0.1609
13        0.0904            0.1693      0.0472            0.1631
14        0.0807            0.1474      0.0451            0.1635
15        0.0982            0.1754      0.0499            0.1640
16        0.0623            0.1456      0.0459            0.1611
17        0.0667            0.1395      0.0523            0.1617
18        0.0825            0.1614      0.0481            0.1633
19        0.0746            0.1605      0.0540            0.1640
20        0.0711            0.1579      0.0440            0.1633
...       ...               ...         ...               ...
99        0.0754            0.1421      0.0430            0.1639
100       0.0737            0.1500      0.0452            0.1632
Average   0.0781            0.1541      0.0468            0.1632

Table 9 summarizes the results on the 100 trials of simulation. Note that due to the space limit, we cannot show all the results on the 100 trials, and only the results of 22 trials and the average result of the 100 trials are reported in Table 9. Error rate is used as the performance metric, which is computed as the percentage of aggregated outputs that are different from the ground truth. It is clear that the proposed method outperforms the baseline across all the trials with large margins. The baseline method treats all the users equally and assigns the same weights to them, and thus the important differences among users are not recognized in the aggregation. Different from the baseline, the proposed weighted aggregation approach considers user-related factors and estimates user weights, which results in more accurate aggregation. The results demonstrate the necessity of modeling user weights in the aggregation process. We also compute the distance between the estimated user weights and the true user weights to illustrate the effectiveness of the proposed method in user weight estimation. We can see that the user weights output by the proposed method approximate the true user weights well (less than 0.1 in distance to the true user weights, much smaller than those of the baseline).
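A simulation of this kind can be sketched as follows. The sketch does not reproduce the paper's setup: the ground truth here is drawn at random rather than taken from real cells, the number of statuses per trial (200) is arbitrary, and the noise model (a user with parameter $a$ reports the true label with probability $a$ and a different label otherwise) is one plausible reading of the description. The tie-break order, the smoothing constant `eps`, and the fixed number of rounds are further assumptions.

```python
import math
import random

LABELS = ("bad", "medium", "good")

def vote(column, weights):
    """Weighted vote over one status; ties go to the earlier label."""
    score = {v: 0.0 for v in LABELS}
    for value, w in zip(column, weights):
        score[value] += w
    return max(LABELS, key=score.get)

def aggregate(claims, weights):
    """claims[u][o] is the label reported by user u for status o."""
    return [vote(column, weights) for column in zip(*claims)]

def weighted_aggregation(claims, rounds=20, eps=1e-3):
    """Alternate aggregation and weight estimation for a fixed number of rounds."""
    weights = [1.0] * len(claims)
    for _ in range(rounds):
        agg = aggregate(claims, weights)
        # 0-1 loss per user; eps keeps the logarithm finite (assumed smoothing).
        loss = [max(sum(c != a for c, a in zip(row, agg)), eps) for row in claims]
        total = sum(loss)
        weights = [-math.log(l / total) for l in loss]
    return aggregate(claims, weights), weights

def error_rate(estimate, truth):
    """Fraction of aggregated outputs that differ from the ground truth."""
    return sum(e != t for e, t in zip(estimate, truth)) / len(truth)

rng = random.Random(42)
truth = [rng.choice(LABELS) for _ in range(200)]   # synthetic ground truth
reliability = [rng.random() for _ in range(30)]    # parameter a per user

def report(true_label, a):
    """Return the true label with probability a, another label otherwise."""
    if rng.random() < a:
        return true_label
    return rng.choice([v for v in LABELS if v != true_label])

claims = [[report(t, a) for t in truth] for a in reliability]
majority = aggregate(claims, [1.0] * len(claims))
weighted, est_weights = weighted_aggregation(claims)
print("majority voting error:", error_rate(majority, truth))
print("weighted aggregation error:", error_rate(weighted, truth))
```

The two printed error rates depend on the seed and the noise model, so they should not be compared with the numbers in Table 9.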

5.2.2 Case Study

We run both the proposed weighted aggregation method and the majority voting baseline on the collected 3G network datasets and compare the network quality results obtained by these two methods. In total, 11.2 percent of the results from these two methods are different. Among them, we randomly choose one case to show how they differ and which result is more reasonable.

In this case study, we are interested in the network quality of a particular cell at a particular time $T_0$. Eight users provide network quality measurement values for this cell at this time. Three of them claim that the network quality is “good”, four users claim it is “medium”, and one user claims it is “bad”. Clearly, the majority voting baseline outputs “medium” as the aggregated value because “medium” receives the most votes. However, the proposed method outputs “good” because it assigns higher weights to the users that claim “good” in user weight estimation. To check which output is more reasonable, we summarize the network quality and traffic information immediately before and after $T_0$ to infer the quality at $T_0$ (shown in Table 10).

The target time stamp is $T_0$, as we try to detect the network quality for this time stamp. From Table 10, we can observe that before and after $T_0$, the network quality is “good”. Network traffic does not show a significant increase at $T_0$ compared with the other time stamps. Therefore, it is reasonable to assume that the network quality at $T_0$ should be similar to that at the other times, which is “good”. This justifies the output of the proposed weighted aggregation method. Compared with majority voting, which treats all users equally, our proposed approach is able to obtain an accurate estimation of user weights by considering the effect of user-related factors, and thus can output a more reasonable aggregated measurement.
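How the two methods can disagree on the same eight claims is illustrated by the Python sketch below. The user identifiers and the weight values are hypothetical, since the text does not report the estimated weights for this cell; only the vote counts (three “good”, four “medium”, one “bad”) come from the case study.

```python
def weighted_vote(claims, weights, labels=("bad", "medium", "good")):
    """Return the label with the largest total weight among the claims."""
    score = {v: 0.0 for v in labels}
    for user, value in claims.items():
        score[value] += weights[user]
    return max(labels, key=score.get)

# Three users claim "good", four claim "medium", one claims "bad".
claims = {
    "u1": "good", "u2": "good", "u3": "good",
    "u4": "medium", "u5": "medium", "u6": "medium", "u7": "medium",
    "u8": "bad",
}

uniform = {u: 1.0 for u in claims}            # majority voting
hypothetical = {                              # illustrative weights only
    "u1": 2.0, "u2": 1.8, "u3": 1.5,
    "u4": 0.9, "u5": 0.8, "u6": 0.7, "u7": 0.6,
    "u8": 0.5,
}

by_majority = weighted_vote(claims, uniform)
by_weights = weighted_vote(claims, hypothetical)
print(by_majority, by_weights)
```

With uniform weights the four “medium” votes win, while the hypothetical weights give the three “good” claims a larger total, which mirrors the disagreement described in the case study.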

5.3 Network Quality Patterns

We conduct the user weight estimation and weighted aggregation on the network quality measurement data collected from users. The algorithm outputs an aggregated network measurement value for each cell at each time. As we do not have the ground truth network quality, we conduct case studies to show the evolution of the aggregated network quality values for some selected cells as time elapses. We find that cells’ network quality changes in various ways, for reasons such as weekdays or weekends, locations, users’ 3G data plans, weather conditions, etc. The evolution of network quality differs across cells, as each cell may have a different set of users and environmental factors, but we identify some common patterns that indicate the effect of these factors. Next, we show some of these patterns and provide meaningful explanations by checking the possible factors that may cause the changes in network quality.

In the following case studies, we focus on explaining the network quality patterns from the perspective of network usage. We analyze several factors that may affect the usage of the network and in turn affect network quality. Typically, a large number of users and heavy usage of the network may result in quality degradation, while a small number of users and light usage contribute to quality improvement. To further demonstrate this connection, we extract cell-level network traffic information by correlating IP packet traces, PCHR logs, and the IMEI database, and show the relationship between network traffic and network quality.

Weekdays or weekends. In downtown or residential areas, people exhibit very different network usage behavior during weekdays and weekends. Fig. 1 shows the network quality evolution during a week for some cells in a downtown area with many companies, where the x-axis is the date and the y-axis shows the percentage of “bad”, “medium”, and “good” network quality during the particular day. As can be seen, on weekdays the percentage of “bad” and “medium” network quality is larger than on weekends. For example, in Fig. 1b, we observe that the network quality on Sunday is maintained as “good” 99 percent of the time, while on weekdays, the network quality drops to “medium” or “bad” much more often (around 11 percent of the time in a day). This is a reasonable estimate, as during weekdays many users go to work in this area, and heavy usage of the network leads to the degradation of network quality. On the other hand, during weekends (third day and fourth day in Fig. 1a, first day in Fig. 1b), network quality is good as most people stay at home.
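The per-day percentages plotted on the y-axis can be computed from the aggregated values of a cell as in the sketch below. It assumes one aggregated label per hour, which is a guess about the time granularity, and the sample day is made up for illustration.

```python
from collections import Counter

def daily_quality_share(labels):
    """Percentage of good / medium / bad among one day's aggregated labels."""
    counts = Counter(labels)
    n = len(labels)
    return {v: 100.0 * counts[v] / n for v in ("good", "medium", "bad")}

# Hypothetical day with 24 hourly aggregated values.
day = ["good"] * 21 + ["medium"] * 2 + ["bad"]
share = daily_quality_share(day)
print(share)
```

Stacking these three percentages for each date gives one bar per day, matching the description of the axes above.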

To confirm the reasoning about this pattern, Fig. 2 plots the comparison between network quality and traffic for the same cell as in Fig. 1a. In Fig. 2, the x-axis is the time (by hour) for a particular day, and the y-axis is the network quality or traffic during that hour. Showing all the days would need too much space, so we select one weekday and one weekend day to illustrate the comparison. It is clear that weekdays witness traffic bursts compared with weekends, which supports our claim that networks experience heavier usage and lower quality during weekdays in this area, as discussed above. Please note that for some timestamps, a traffic burst occurs while the network quality remains good, which is caused by the fact that the number of users within those timestamps is relatively small. This indicates that the network quality is affected by network traffic but is not determined by traffic alone.

TABLE 10
Network Information for Case Study

Timestamp   Network Quality   Network Traffic (Byte)
T_{-3}      good              1,205,341
T_{-2}      good              67,210
T_{-1}      good              13,028
T_0         ?                 22,203
T_1         good              20,771
T_2         good              11,535
T_3         good              18,339

Fig. 1. Effect of weekdays on cells’ network quality.

In Fig. 3, we present patterns in network quality for residential areas. In contrast to the downtown locations, the network quality in residential areas decreases during weekends (third day and fourth day in Fig. 3a, first day in Fig. 3b) as well as weekday evenings (eighth day in Fig. 3a and the remaining days in Fig. 3b) due to heavier usage. During business hours, the network quality is better as people go to their workplaces.

The patterns we find in network quality with respect to weekdays and weekends are commonly observed and repeat themselves periodically. Such patterns capture users’ everyday mobility activities and can be used to predict network usage behavior. Network providers can adjust network parameters accordingly to provide network services of better quality.

Locations. Figs. 4 and 5 show that there exist cells with constantly high or low network quality. In Fig. 4, the cells have good quality all the time, while the cells shown in Fig. 5 always exhibit poor network quality. We identify the locations of these cells and find that places with few people and lighter usage loads typically have high-quality networks, but places with a constantly high population and frequent network usage have low-quality networks. The locations shown in Fig. 4 are isolated places near an airport or a highway, while the locations of Fig. 5 are popular places such as schools and attractions. Further, Fig. 6 shows the network traffic of a cell located at a popular attraction. From this figure, we observe that network traffic stays at a high level most of the time, so it is expected to have low network quality.

These patterns reveal the variation in base station loads, and thus can be useful for better scheduling of base station construction. With these observed patterns in network quality, network providers can have a better idea of where they should build more base stations to improve the network quality of an area.

Data plan. In Fig. 7, we present an interesting pattern related to data plan usage. We can see that on December 1st, the percentage of “medium” and “bad” network quality is between 20 and 30 percent, while for the remaining days it is below 10 percent. This is in fact caused by the policy of users’ data plans. In the collected datasets, the billing cycle always starts on the first day of each month. A data plan has a volume limit each month, and users typically have very limited quota left at the end of the month. To avoid exceeding the limit, they tend to decrease their usage of the 3G network. When a new month starts, users get sufficient data quota and start to surf the web immediately. This causes the network usage to go up and the network quality to go down during the first day of a month.

Fig. 2. Network quality and corresponding traffic for Fig. 1a.

Fig. 3. Effect of weekends on cells’ network quality.

Fig. 4. Effect of locations on cells’ network quality.

Fig. 5. Effect of locations on cells’ network quality.

Fig. 6. Network quality and corresponding traffic for Fig. 5a.

In Fig. 8, we show one cell’s network quality and corresponding traffic on December 1st. As a comparison, we also plot this cell’s network quality and traffic for another day. It is obvious that on December 1st, heavy usage occurs at non-peak hours (e.g., midnight on this day), which validates the above explanation of this interesting pattern.

This pattern shows the effect of data plan policy on network quality. Network providers may consider modifying the existing policy to promote balanced usage within each month. For example, promotion offers can be made to encourage users to use 3G networks even at the end of their monthly plan, or users’ plans can start on any day of a month to avoid competition for network resources at the beginning of a month.

Weather conditions. Another factor that causes changes in network quality is the weather. The effect of weather conditions is obvious: the network quality of certain cells drops when the weather becomes bad. Table 11 shows the weather conditions during eight days, and Fig. 9 plots the network quality changes of a cell in the data collection of City A. As an example, we find that in the morning of the eighth day in the data collection of City A, it rains heavily, so the network quality of that day is bad compared with the other days. With real-time network quality monitoring, network providers can recognize such problems and react in a timely manner to fix the issues that lead to the performance drop in network services.

In this section, we explain the aggregated quality measure we obtain using our proposed crowdsourcing aggregation approach by analyzing the connection of the aggregated network quality to several environmental factors. Note that the cases we show may have one or two outstanding factors, but network quality is still influenced by multiple factors jointly. Therefore, instead of modeling these factors one by one, the best practice is to deploy a network quality monitoring system that can provide network providers with accurate network quality measurements to help them adjust network services. The proposed aggregation framework will be a key component of this quality monitoring system because it provides robust and accurate estimates of network quality in a cost-effective way.

5.4 Running Time

In this part, we show the efficiency and scalability of the proposed approach. In Section 2.4, the time complexity of the proposed approach is analyzed, and it shows that the running time of the proposed method is linear with respect to the number of claims. To experimentally demonstrate this result, we sample different numbers of claims from both of the two real datasets and record the running time of the proposed approach. Further, the running time of majority voting, which has the optimal efficiency, is also reported. From Fig. 10, we can observe that although the proposed method requires more computation time than majority voting, both of them have linear complexity with respect to the number of claims.

Fig. 7. Effect of data plans on cells’ network quality.

Fig. 8. Network quality and corresponding traffic for Fig. 7a.

TABLE 11
Weather Conditions for City A During Eight Days

Day      Weather Condition   Day      Weather Condition
Nov 25   clear               Nov 29   clear
Nov 26   clear               Nov 30   clear
Nov 27   clear               Dec 1    fog
Nov 28   clear               Dec 2    rain

Fig. 9. Effect of weather conditions on a cell’s network quality.

Fig. 10. Running time on collected datasets.

In Section 2.6, we mention that the proposed method can be deployed on a parallel system, which enables it to deal with large-scale data efficiently. To demonstrate this advantage, we implement the parallel version of the proposed method based on Hadoop, and repeat the above experiments on a 15-node Dell Hadoop cluster with Intel Xeon E5-2403 processors. For the purpose of comparison, we report the running time of both the single-machine version and the MapReduce version of the proposed method in Table 12. When the number of claims is small, the parallel version of the proposed method requires a bit more computation time due to the communication cost within the parallel system. However, when the number of claims increases, the benefit of the parallel version becomes significant. These observations demonstrate that the proposed method can be applied to network services in real practice to facilitate real-time network quality monitoring.
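The idea of parallelizing the estimation step through partial sums can be emulated on a single machine with plain Python. This is not the Hadoop implementation used in the experiments: the function names, the chunking, and the smoothing constant are assumptions, and the example data reuse the running example (Table 3 claims with the round-1 aggregated values of Table 4).

```python
import math
from collections import Counter
from functools import reduce

def map_partial_loss(chunk, aggregated):
    """Mapper: per-user 0-1 loss over one chunk of (user, status, value) claims."""
    partial = Counter()
    for user, status, value in chunk:
        partial[user] += int(value != aggregated[status])
    return partial

def merge(a, b):
    """Reducer: add partial sums. update() keeps users whose loss is zero."""
    merged = Counter(a)
    merged.update(b)
    return merged

def weights_from_loss(loss, eps=1e-3):
    """t_u = -log(loss_u / total loss); eps is an assumed smoothing value."""
    total = sum(loss.values())
    return {u: -math.log(max(l, eps) / total) for u, l in loss.items()}

table3 = {
    "u1": ["bad", "bad", "good", "medium", "good"],
    "u2": ["medium", "medium", "good", "good", "good"],
    "u3": ["medium", "good", "medium", "bad", "medium"],
}
claims = [(u, o, v) for u, values in table3.items() for o, v in enumerate(values)]
aggregated = ["medium", "bad", "good", "bad", "good"]   # Table 4, last row

# Split the claims into two chunks, as two workers would receive them.
chunks = [claims[:8], claims[8:]]
total_loss = reduce(merge, (map_partial_loss(c, aggregated) for c in chunks))
parallel_weights = weights_from_loss(total_loss)
print(dict(total_loss), parallel_weights)
```

Because the per-user losses are plain sums, the result does not depend on how the claims are split across chunks, which is what makes the estimation step suitable for MapReduce-style execution.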

5.5 Summary of Insights

We proposed a novel crowdsourcing approach to overcome the limitations of the traditional drive testing approach so that network providers can effectively monitor network quality. The proposed approach is based on a weighted aggregation of user-provided measurements, where users are weighted according to user-related factors. This is motivated by the fact that user-provided network information could be affected by user device, user mobility, user habits, or other user-related factors. In Section 5.1, we validate this assumption by checking the user weight assignment and its relationship with device types.
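To make the idea of weighted aggregation concrete, the following is a minimal sketch in the spirit of generic truth discovery methods such as [4]. It is not the authors' exact formulation: it assumes that each claim is a discrete quality label, that a user's weight is the negative log of the user's normalized disagreement rate, and that a fixed number of alternating updates is sufficient. The function names, the `(user, cell, label)` claim format, and the `eps` smoothing constant are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def majority_vote(claims):
    """Unweighted baseline: the most frequent label reported for each cell."""
    tallies = defaultdict(Counter)
    for _user, cell, label in claims:
        tallies[cell][label] += 1
    return {cell: tally.most_common(1)[0][0] for cell, tally in tallies.items()}


def weighted_aggregate(claims, n_iter=10, eps=1e-9):
    """Alternate between estimating cell labels and user weights.

    claims is a list of (user, cell, label) tuples. Each iteration makes
    two passes over the claims. The weight rule below is an assumed,
    generic choice and not necessarily the one used in the paper.
    """
    users = {user for user, _cell, _label in claims}
    weights = {user: 1.0 for user in users}
    truths = {}
    for _ in range(n_iter):
        # Step 1: weighted vote per cell under the current user weights.
        votes = defaultdict(lambda: defaultdict(float))
        for user, cell, label in claims:
            votes[cell][label] += weights[user]
        truths = {
            cell: max(sorted(tally), key=tally.get)
            for cell, tally in votes.items()
        }
        # Step 2: users who disagree with the consensus more often
        # receive lower weights.
        wrong = defaultdict(int)
        total = defaultdict(int)
        for user, cell, label in claims:
            wrong[user] += label != truths[cell]
            total[user] += 1
        rates = {user: wrong[user] / total[user] + eps for user in users}
        norm = sum(rates.values())
        weights = {user: -math.log(rates[user] / norm) for user in users}
    return truths, weights
```

As a usage example, suppose users a, b, and c agree on four cells while users d and e report the opposite label on each of them, and a fifth cell is reported only by a ("good"), d ("bad"), and e ("bad"). Majority voting returns "bad" for the fifth cell, whereas this sketch returns "good", because d and e have accumulated low weights from their disagreement on the other cells.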

We show the effectiveness of the proposed approach through quantitative analysis in Section 5.2. First, the evaluation is conducted on synthetic datasets, for which ground truth information is available. The error rate of the proposed method is only around half of the error incurred by the baseline method. Then we show how and why the proposed weighted aggregation method outperforms the baseline method through a real case study.

Then we apply the proposed crowdsourcing approach to two large-scale real-world datasets involving 616,796 users and 22,715 cells. Based on the aggregated results, we show some network quality evolutionary patterns in Section 5.3. From these patterns, we can clearly identify the impact of weekdays and weekends, user data plans, etc. on network quality. These patterns are further validated by comparing network quality and the corresponding traffic. According to these patterns, network providers can adjust cell access control, scheduling, and planning to avoid quality decrease and provide better services to network users.

Last but not least, we demonstrate that the proposed method runs in linear time with respect to the size of the dataset, and we also implement a parallel version of the proposed method based on Hadoop. This shows that the proposed method has good scalability and is suitable for real-time quality monitoring.

6 RELATED WORK

There are two types of network quality evaluation methods to infer users’ experience: objective and subjective methods. This partition is based on the parameters or data used to estimate the network quality: subjective methods require users’ personal experience reports as input, while objective methods use various network measurements.

An example of subjective methods is [2], in which the authors proposed a framework to measure network quality in a subjective way: users can click a button on the interface when they are unhappy about the current network quality. In practice, because subjective methods need cooperation from users, they may not be easy to deploy at large scale.

For objective methods, various parameters and performance metrics, such as network traffic and speech exchange rate, can be used to derive network quality. Among them, two typical places to extract information for network quality inference are the data plane and the control plane. In data-plane performance analysis, the traffic of GPRS/UMTS networks is characterized by TCP performance [8] and the round-trip time of TCP flow data [9]. In [10], the similarities and differences between the traffic of a CDMA2000 network and wireline data traffic are presented. Furthermore, the traffic dynamics of cellular devices are characterized according to device types and application dimensions in [11]. As for control-plane performance analysis, in [12], data packet headers and various signaling messages are collected from a national 3G network and used to analyze their temporal and spatial variations. In [13], the authors analyze the RRC state transitions of user sessions using collected traces.

The main difference between the above related work and our work is that the related work studies network quality measurement for individual users, while our work takes an orthogonal perspective: we measure the network quality for network providers. We propose a weighted crowdsourcing approach that aggregates individual users’ information.

The proposed approach estimates user weights to capture user-related factors. Some existing work recognizes the diversity of user behaviors, and thus motivates and supports our method to estimate user weights. For example, several papers investigate the variance and diversity in smartphone usage as follows. In [14], the authors presented a detailed analysis of traffic on smartphones with 43 users. Later, they found diversity in user behavior by analyzing a dataset collected from 225 individual users [15]. The work in [16] identifies diverse usage behavior of smartphones from the perspective of apps on national-level networks. In [17], the authors studied the geospatial correlations between network traffic and application usage from the user perspective. Note that although the diversity among users has been recognized, such information has not previously been taken into account in network quality evaluation. As far as we know, this is the first work to characterize user-related factors and use such valuable information in the task of network quality measurement aggregation.

TABLE 12
Running Time Comparison on Large-Scale Datasets

Number of Claims    Single-Machine (s)    MapReduce (s)
1,000,000           27                    100
10,000,000          288                   192
100,000,000         2860                  669
193,480,418         5633                  984

Recently, [18] studied the phenomenon that cellular network quality degrades during crowded events. Based on their findings, two mechanisms were suggested to improve network quality. It is interesting to see that one of our observed patterns about users’ data plans (Fig. 7) can be considered as a crowded event, which is consistent with their observation.

In [19], the authors studied the effect of network protocol and application behavior on the performance of 4G LTE networks. Although we use 3G network datasets in this paper, the proposed method can be adapted to 4G networks, which is considered to be future work. We also plan to explore the aggregation of other metrics for network quality, such as throughput, loss rate, or other contexts of data [20].

7 CONCLUSION

Network quality is a very important metric for network providers to consider when making scheduling and planning decisions to ensure satisfactory user experience. The traditional drive testing approach to measuring network quality suffers from the limitations of high cost, limited coverage, and out-of-date measurement values. In this paper, we propose a novel strategy to report up-to-date and accurate estimates of network quality at little cost. The basic idea is to collect measurement values from a large number of users who participate in the networks and to aggregate the measurement values to obtain a consensus quality measurement. We also propose a weighted aggregation method to capture the fact that user-provided measurement values are affected by user-related factors. The proposed aggregation method estimates user weights and the aggregated network quality measurement simultaneously using a joint optimization framework. The solution to the optimization problem improves the user weight and network quality estimates iteratively. Experimental results on two real datasets collected from two cities’ 3G networks demonstrate the effectiveness and efficiency of the proposed aggregation strategy. As shown in the results, network quality patterns are affected by several possible environmental factors. The proposed method outperforms the majority voting baseline in its ability to characterize network quality, and it is efficient, with linear time complexity.

REFERENCES

[1] I. Kostanic, N. Mijatovic, and S. D. Vest, “Measurement based QoS comparison of cellular communication networks,” in Proc. IEEE Int. Workshop Tech. Committee Commun. Quality Rel., 2009, pp. 1–5.

[2] K.-T. Chen, C.-C. Tu, and W.-C. Xiao, “Oneclick: A framework for measuring network quality of experience,” in Proc. IEEE Int. Conf. Comput. Commun., 2009, pp. 702–710.

[3] K.-T. Chen, C.-Y. Huang, P. Huang, and C.-L. Lei, “Quantifying Skype user satisfaction,” in Proc. Conf. Appl., Technol., Archit., Protocols Comput. Commun., 2006, pp. 399–410.

[4] Q. Li, Y. Li, J. Gao, B. Zhao, W. Fan, and J. Han, “Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation,” in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2014, pp. 1187–1198.

[5] D. P. Bertsekas, Non-Linear Programming, 2nd ed. Belmont, MA, USA: Athena Scientific, 1999.

[6] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.

[7] C.-T. Chu, S. K. Kim, Y.-A. Lin, Y. Yu, G. Bradski, A. Y. Ng, and K. Olukotun, “Map-reduce for machine learning on multicore,” in Proc. Adv. Neural Inf. Process. Syst., 2006, pp. 281–288.

[8] F. Ricciato, F. Vacirca, and M. Karner, “Bottleneck detection in UMTS via TCP passive monitoring: A real case,” in Proc. ACM Conf. Emerging Netw. Exp. Technol., 2005, pp. 211–219.

[9] J. Kilpi and P. Lassila, “Micro- and macroscopic analysis of RTT variability in GPRS and UMTS networks,” in Proc. 5th Int. IFIP-TC6 Conf. Netw. Technol. Serv. Protocols Perform. Comput. Commun. Netw. Mobile Wireless Commun. Syst., 2006, pp. 1176–1181.

[10] J. Ridoux, A. Nucci, and D. Veitch, “Seeing the difference in IP traffic: Wireless versus wireline,” in Proc. IEEE Int. Conf. Comput. Commun., 2006, pp. 1–12.

[11] M. Z. Shafiq, L. Ji, A. X. Liu, and J. Wang, “Characterizing and modeling internet traffic dynamics of cellular devices,” in Proc. ACM SIGMETRICS/Int. Conf. Meas. Model. Comput. Syst., 2011, pp. 305–316.

[12] U. Paul, A. P. Subramanian, M. M. Buddhikot, and S. R. Das, “Understanding traffic dynamics in cellular data networks,” in Proc. IEEE Int. Conf. Comput. Commun., 2011, pp. 882–890.

[13] F. Qian, Z. Wang, A. Gerber, Z. M. Mao, S. Sen, and O. Spatscheck, “Characterizing radio resource allocation for 3G networks,” in Proc. ACM SIGCOMM Conf. Internet Meas., 2010, pp. 137–150.

[14] H. Falaki, D. Lymberopoulos, R. Mahajan, S. Kandula, and D. Estrin, “A first look at traffic on smartphones,” in Proc. ACM SIGCOMM Conf. Internet Meas., 2010, pp. 281–287.

[15] H. Falaki, R. Mahajan, S. Kandula, D. Lymberopoulos, R. Govindan, and D. Estrin, “Diversity in smartphone usage,” in Proc. 8th Int. Conf. Mobile Syst., Appl. Serv., 2010, pp. 179–194.

[16] Q. Xu, J. Erman, A. Gerber, Z. Mao, J. Pang, and S. Venkataraman, “Identifying diverse usage behaviors of smartphone apps,” in Proc. ACM SIGCOMM Conf. Internet Meas., 2011, pp. 329–344.

[17] M. Z. Shafiq, L. Ji, A. X. Liu, J. Pang, and J. Wang, “Characterizing geospatial dynamics of application usage in a 3G cellular data network,” in Proc. IEEE Int. Conf. Comput. Commun., 2012, pp. 1341–1349.

[18] M. Z. Shafiq, L. Ji, A. X. Liu, J. Pang, S. Venkataraman, and J. Wang, “A first look at cellular network performance during crowded events,” in Proc. ACM SIGMETRICS/Int. Conf. Meas. Model. Comput. Syst., 2013, pp. 17–28.

[19] J. Huang, F. Qian, Y. Guo, Y. Zhou, Q. Xu, Z. M. Mao, S. Sen, and O. Spatscheck, “An in-depth study of LTE: Effect of network protocol and application behavior on performance,” in Proc. ACM Conf. SIGCOMM, 2013, pp. 363–374.

[20] X. He, P. P. Lee, L. Pan, C. He, and J. C. Lui, “A panoramic view of 3G data/control-plane traffic: Mobile device perspective,” in Proc. 11th Int. IFIP TC 6 Conf. Netw., 2012, pp. 318–330.

Yaliang Li received the BS degree from Nanjing University of Posts and Telecommunications in 2010. He is currently working toward the PhD degree in the Department of Computer Science and Engineering, SUNY Buffalo. His research topics include truth discovery, text and web mining, privacy-preserving data mining, and data mining application in healthcare.

Jing Gao received the PhD degree from the Computer Science Department, University of Illinois at Urbana-Champaign in 2011, and subsequently joined SUNY Buffalo in 2012. She is an assistant professor in the Department of Computer Science and Engineering at SUNY Buffalo. Her research interests include data and information analysis with a focus on truth discovery, crowdsourcing, information integration, ensemble methods, mining data streams, transfer learning, and anomaly detection. She is a member of the IEEE.


Patrick P. C. Lee received the BEng degree (first-class Hons.) in information engineering from the Chinese University of Hong Kong in 2001, the MPhil degree in computer science and engineering from the Chinese University of Hong Kong in 2003, and the PhD degree in computer science from Columbia University in 2008. He is currently an associate professor in the Department of Computer Science and Engineering, Chinese University of Hong Kong. His research interests include various applied/systems topics including storage systems, distributed systems and networks, operating systems, dependability, and security. He is a member of the IEEE.

Lu Su received the MS degree in statistics and the PhD degree in computer science, both from the University of Illinois at Urbana-Champaign, in 2012 and 2013, respectively. He is an assistant professor in the Department of Computer Science and Engineering at SUNY Buffalo. His research focuses on the general areas of cyber-physical systems, wireless and sensor networks, and mobile computing. He has also worked at IBM T. J. Watson Research Center and National Center for Supercomputing Applications. He is a member of the ACM and IEEE.

Caifeng He received the bachelor’s and master’s degrees in computational mathematics, both from Jilin University, in 2009 and 2011, respectively. Her research interests include traffic analysis, fault demarcation, user profiling in mobile communication networks, and streaming machine learning algorithms for supporting on-line applications in mobile networks.

Cheng He received the BS and MS degrees in communication and information systems from UESTC, Chengdu, in 2003 and 2006, respectively. He joined Huawei Research Institute in 2006, and now he is a principal engineer and research manager in Noah’s Ark Lab of Huawei. His research interests include traffic measurement & modeling, distributed stream computing, and big data stream mining & online learning for intelligent network management and control.

Fan Yang received the BS and MS degrees in electrical engineering from XiDian University, in 2007 and 2010, respectively. His research interests include telecom network optimization, network traffic measurement and modeling, and internet financial data modeling.

Wei Fan received the PhD degree in computer science from Columbia University in 2001. He is currently the senior director and the deputy head of Baidu Big Data Lab in Sunnyvale, CA. His main research interests and experiences are in various areas of data mining and database systems, such as deep learning, stream computing, high performance computing, extremely skewed distribution, cost-sensitive learning, risk analysis, ensemble methods, easy-to-use nonparametric methods, graph mining, predictive feature discovery, feature selection, sample selection bias, transfer learning, time series analysis, bioinformatics, social network analysis, novel applications, and commercial data mining systems. His co-authored papers received ICDM’06/KDD11/KDD12/KDD13/KDD97 Best Paper & Best Paper Runner-up Awards. He led the team that used his Random Decision Tree (www.dice.com) method to win the 2008 ICDM Data Mining Cup Championship. He received the 2010 IBM Outstanding Technical Achievement Award for his contribution to IBM Infosphere Streams. He is an associate editor of the ACM Transactions on Knowledge Discovery and Data Mining. During his time as the associate director of the Huawei Noah’s Ark Lab in Hong Kong from August 2012 to December 2014, he led his colleagues to develop Huawei StreamSMART, which is a streaming platform for online and real-time processing, query, and mining of very fast streaming data. StreamSMART is three to five times faster than STORM and 10 times faster than SparkStreaming, and was used in Beijing Telecom, Saudi Arabia STC, Norway Telenor, and a few other mobile carriers in Asia. Since joining Baidu Big Data Lab, he has been working on medical and healthcare research and applications, such as deep learning-based disease diagnosis based on NLP input as well as medical dialogue robot.
