Research Article Extended Privacy in Crowdsourced Location...

14
Research Article Extended Privacy in Crowdsourced Location-Based Services Using Mobile Cloud Computing Jacques Bou Abdo, 1 Thomas Bourgeau, 2 Jacques Demerjian, 3 and Hakima Chaouchi 4 1 Computer Science Department, Notre Dame University-Louaize, Zouk Mosbeh, P.O. Box 72, Lebanon 2 UPMC, Sorbonne University, LIP6, Paris, France 3 Faculty of Sciences, Lebanese University, LARIFA-EDST, Pierre Gemayel Campus, Fanar, Lebanon 4 Telecom SudParis, Institut Telecom, CNRS SAMOVAR, UMR 5751, 9 rue Charles Fourier, 91011 Evry, France Correspondence should be addressed to Jacques Bou Abdo; [email protected] Received 28 January 2016; Accepted 19 June 2016 Academic Editor: Michele Amoretti Copyright © 2016 Jacques Bou Abdo et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Crowdsourcing mobile applications are of increasing importance due to their suitability in providing personalized and better matching replies. e competitive edge of crowdsourcing is twofold; the requestors can achieve better and/or cheaper responses while the crowd contributors can achieve extra money by utilizing their free time or resources. Crowdsourcing location-based services inherit the querying mechanism from their legacy predecessors and this is where the threat lies. In this paper, we are going to show that none of the advanced privacy notions found in the literature except for -anonymity is suitable for crowdsourced location-based services. In addition, we are going to prove mathematically, using an attack we developed, that -anonymity does not satisfy the privacy level needed by such services. To respond to this emerging threat, we will propose a new concept, totally different from existing resource consuming privacy notions, to handle user privacy using Mobile Cloud Computing. 1. Introduction Mobile Cloud Computing (MCC) is a very promising tech- nology supported by major Cloud Service Providers (CSPs), mobile operators, and mobile vendors. CSPs are even offering Cloud platforms (such as Google Cloud Platform [1]) to be utilized by 3rd-party development companies, which will result in a boom in mobile Cloud applications. Some of the applications might have similar features forcing platform developers to thrive for a competitive edge to make their products profitable. In this highly competitive industry, an application’s survival is critically based on its capability to respond to users’ preferences. Mobile Cloud tries to satisfy user’s requirements by offering extensive computa- tion/storage resources which are accessible through mobile networks and decrease the mobile’s power consumption. However, these resources are not always enough to meet customer needs. Although computers’ intelligence and processing power are drastically increasing, search engines are still bounded to factual answers and fail in providing personal opinions which are more valuable [2] and could only be provided by human interaction. Crowdsourcing is broadcasting tasks, used to be executed by machines or employees, into an external set of people (contributors) [3, 4], as shown in Figure 1. Each of the crowd contributors has a pool of expertise that spans across different categories such as topic, language, geographic location, age, gender, and education. When a request (outsourced task) is sent, the crowdsourc- ing server evaluates it and retrieves the expertise needed for successful execution. e contributors matching the needed expertise are then selected to respond to the request. Crowdsourcing can be implemented in different applica- tions and scenarios such as the following: (i) “Social search engine” [4] is one of the most popular crowdsourcing application themes which focuses on answering context-related questions using human help (crowd) instead of or complementary with search Hindawi Publishing Corporation Mobile Information Systems Volume 2016, Article ID 7867206, 13 pages http://dx.doi.org/10.1155/2016/7867206

Transcript of Research Article Extended Privacy in Crowdsourced Location...

Page 1: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Research ArticleExtended Privacy in Crowdsourced Location-Based ServicesUsing Mobile Cloud Computing

Jacques Bou Abdo1 Thomas Bourgeau2 Jacques Demerjian3 and Hakima Chaouchi4

1Computer Science Department Notre Dame University-Louaize Zouk Mosbeh PO Box 72 Lebanon2UPMC Sorbonne University LIP6 Paris France3Faculty of Sciences Lebanese University LARIFA-EDST Pierre Gemayel Campus Fanar Lebanon4Telecom SudParis Institut Telecom CNRS SAMOVAR UMR 5751 9 rue Charles Fourier 91011 Evry France

Correspondence should be addressed to Jacques Bou Abdo jbouabdonduedulb

Received 28 January 2016 Accepted 19 June 2016

Academic Editor Michele Amoretti

Copyright copy 2016 Jacques Bou Abdo et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

Crowdsourcing mobile applications are of increasing importance due to their suitability in providing personalized and bettermatching replies The competitive edge of crowdsourcing is twofold the requestors can achieve better andor cheaper responseswhile the crowd contributors can achieve extra money by utilizing their free time or resources Crowdsourcing location-basedservices inherit the querying mechanism from their legacy predecessors and this is where the threat lies In this paper we are goingto show that none of the advanced privacy notions found in the literature except for 119870-anonymity is suitable for crowdsourcedlocation-based services In addition we are going to prove mathematically using an attack we developed that 119870-anonymity doesnot satisfy the privacy level needed by such services To respond to this emerging threat we will propose a new concept totallydifferent from existing resource consuming privacy notions to handle user privacy using Mobile Cloud Computing

1 Introduction

Mobile Cloud Computing (MCC) is a very promising tech-nology supported by major Cloud Service Providers (CSPs)mobile operators andmobile vendors CSPs are even offeringCloud platforms (such as Google Cloud Platform [1]) to beutilized by 3rd-party development companies which willresult in a boom in mobile Cloud applications Some ofthe applications might have similar features forcing platformdevelopers to thrive for a competitive edge to make theirproducts profitable In this highly competitive industry anapplicationrsquos survival is critically based on its capabilityto respond to usersrsquo preferences Mobile Cloud tries tosatisfy userrsquos requirements by offering extensive computa-tionstorage resources which are accessible through mobilenetworks and decrease the mobilersquos power consumptionHowever these resources are not always enough to meetcustomer needs

Although computersrsquo intelligence and processing powerare drastically increasing search engines are still bounded to

factual answers and fail in providing personal opinions whichare more valuable [2] and could only be provided by humaninteraction

Crowdsourcing is broadcasting tasks used to be executedby machines or employees into an external set of people(contributors) [3 4] as shown in Figure 1 Each of the crowdcontributors has a pool of expertise that spans across differentcategories such as topic language geographic location agegender and education

When a request (outsourced task) is sent the crowdsourc-ing server evaluates it and retrieves the expertise needed forsuccessful execution The contributors matching the neededexpertise are then selected to respond to the request

Crowdsourcing can be implemented in different applica-tions and scenarios such as the following

(i) ldquoSocial search enginerdquo [4] is one of the most popularcrowdsourcing application themes which focuses onanswering context-related questions using humanhelp (crowd) instead of or complementarywith search

Hindawi Publishing CorporationMobile Information SystemsVolume 2016 Article ID 7867206 13 pageshttpdxdoiorg10115520167867206

2 Mobile Information Systems

Crowd

RequesterRequest

Reply

Crowdsourcingserver

Contributor

ContributorContributor

Contributor

Contributor

Figure 1 Crowdsourcing scenario

engines [5ndash7] Chacha [8] is a popular crowdsourcingapplication

(ii) ldquoCrowdsourced location-based servicerdquo [4] is amethodto fetch the recommendations about certain location-based categories posted by people with taste andinterest similar to the requester Foursquare [9] is apopular application offering crowdsourced location-based service (LBS)

In this paper we will be focusing on ldquocrowdsourcedlocation-based servicerdquo (crowdsourced LBS) but our workcould be extended to other types of applications Chorus[10] Foursquare [9 11] and CrowdSearcher [2] are threeexamples of crowdsourced LBS applications which requireuserrsquos location to be transmitted with the request in order toensure optimized task routing

Figure 2 shows a sample request sent to CrowdSearcherand the steps followed in the resolution mechanism Therequest is as follows find a good jobwith a suitable apartmentclose to a good drug rehabilitation center within my neigh-borhood (ie near my current location) The steps taken torespond to the above query are as follows

(1) The resolution mechanism searches for jobs suitablefor the requestorrsquos profile and houses offered for rentboth located in the requestorrsquos neighborhood

(2) The retrieved jobs and their offered salaries arecompared based on the crowdrsquos opinion The selectedcrowd has to have expertise in the requesterrsquos jobdomain and geographic location

(3) The drug rehabilitation centers close to the retrievedhouses are compared based on the crowdrsquos opinionThe second crowd has to have expertise in healthor social companionship and the studied geographiclocation

Resolution mechanism

Selectjob

Selecthouse

for rent

Ask foropinions

on job

Search fordrugrehabcenter

Ask forsuggestions

on rehabcenters

Query jobposition

features (role

Reply

Que

rym

echa

nism

Request

Reply

location etc)

Figure 2 Task division in crowdsourcing location-based services

RequesterQuery

Answer

Crowdsourcing serverNeighbor attacker

Insider attacker

Figure 3 Attacker model

(4) The ldquohouse-job-rehabilitation centerrdquo result havingthe highest crowd rate is sent back to the requestor

The square blocks in Figure 2 represent machine-basedtasks while the hexagon blocks represent crowd-based ones

As shown in the above scenario and Figure 2 Crowd-Searcher requires userrsquos location as part of the request in orderto filter out the crowd not belonging to the area the user isinterested in (since geographic location is one of the neededexpertise areas) The contributors (crowd) are not aware ofthe full query but respond only to small portions (eg Howdo you rate this rehab center) (last hexagon in Figure 2)

As userrsquos location is transmitted an attacker who islocated just before (or within) the crowdsourcing servercan capture the full query containing the requesterrsquos currentlocation as shown in Figure 3The attackermodel is discussedin Section 7

It is easy to relate userrsquos identity to the transmittedlocation and in turn to the request This identity-query dis-closure reveals private information about the userrsquos interestsaffiliations and future plans and possibly helps in trackinghim In our case the attacker can easily identify the requesterand deduce that he or somebody in his small family isa drug addict The attacker can sell this information tolocal drug dealers who can target the addict after leavingthe rehabilitation center The relapse rate (return back toaddiction after rehabilitation) is very high (40ndash60) [12 13]which makes the postaddict an easy prey

Crowdsourced LBS natively compromises the privacy ofall its users because it uses the same security mechanismsutilized by legacy LBS which contains well-known privacy-breaching vulnerabilities Research efforts were exerted to

Mobile Information Systems 3

LBSUser eNB

User identity ALocation 01001

Pseudonym ZDRFUser identity ZDRF

Location 01001

User identity ZDRFLocation 01001

User identity ZDRFRequested location

01110

User identity ZDRFRequested location 01110

Figure 4 Anonymization using pseudonym

ensure the privacy of legacy LBS users but all failed to preventidentity-query disclosure To the best of our knowledge nonehas taken advantage of crowdsourced LBSrsquos mobile Cloudnature to offer enhanced privacy

Matching userrsquos location to his queries has severe securityconsequences since location can be related to identity [14]and queries can be related to user preferences interests ideasand beliefs Being able to match userrsquos identity to preferencecan be used for customer profiling and behavior expectancyTyrant governments would be interested in matching queriesand posts countering their regimes to the identity andlocation of the activists Safeguarding this match is crucial tomaintain user privacy and sometimes user safety

We are going to show in this paper that all the advancedprivacy notions found in the literature used for anonymiza-tion are not suitable for LBS We are going to show also thatthe suitable privacy notions have proven vulnerabilities Wecan then deduce that location privacy in crowdsourced LBS isnotmet and could be easily breached thus the need for a newprivacy model is inevitable Finally we are going to propose anew privacymodel for crowdsourced LBS based on itsmobileCloud nature

The remainder of this paper is organized as followsSection 2 presents background information to make thispaper self-contained Section 3 surveys the latest and mostadvanced privacy notions found in the literature and provesthat none is suitable for crowdsourced LBS except 119870-anonymity Sections 4 5 and 6 discuss the ldquofrequency attackrdquowe developed against 119870-anonymity Section 4 shows themathematical model behind 119870-anonymity constraint whichis used to evaluate the privacy level of different locationprivacy preservingmechanisms (LPPMs) Section 5 proposesldquofrequency attackrdquo which is used to exploit the most secureLPPM ldquofootprintsrdquo Section 6 shows the simulation results ofldquofrequency attackrdquo to evaluate the privacy breach level

A new solution for this privacy issue is proposed in Sec-tion 7 by changing the computational environment FinallySection 8 concludes the paper

2 Related Work

In this section we present legacy LBS which is the prereq-uisite to understand crowdsourced LBS architecture and theused privacy preserving mechanisms We also present theused anonymization and privacy notion (119870-anonymity) Wethen survey the privacy requirements that should be main-tained by location privacy preserving mechanisms which arealso surveyed At the end of this section we survey the attackson119870-anonymity to show its vulnerabilities

21 Legacy Location-Based Services and 119870-AnonymityLocation-based service is a computer-level online servicethat utilizes the userrsquos current position as a critical inputfor the application providing this service [15] Locationcoordinates can be delivered through GPS equipped mobiledevices [16] or through the mobile userrsquos operator If amultilateration positioning technique is used in the servingnetwork then the exact location of the user can be specifiedIf multilateration is not used then only the distance to theserving eNB (evolved Node B) is delivered in addition tothe eNBrsquos coordinates in other words the user knows that itbelongs to the circumference of a circle centered at servingeNB its radius is the userrsquos distance to the center Note that

(i) location-dependent query is a user triggered requestto a location-based service

(ii) nearest neighbor query is a location-dependent queryrequesting the address of the nearest point of interest

Before being able to request any information fromlocation-based services the mobile user has to update hislocation by sending the coordinates to the LBS server whichin turn replies with the requested information Security of thelocation-related information transmitted over the air channelis considered to be outside the scope of this paper due to theimplemented confidentiality and integrity protection at theAccess Stratum (AS) layer We will only consider last mileeavesdropping (between PDN Gateway (P-GW) and LBSserver) carried out by outside attackers (neighbor attacks)or the service providers themselves (insider attacks) Captur-ing insecure identities and location information allows theattacker to breach the userrsquos privacy by being able to knowif the user is in a certain area and where precisely he is

Anonymization was proposed using pseudonym identi-ties as a cost-effective way to ensure identity privacy Identityanonymization is implemented at the mobile level wherea pseudonym is generated to replace the username in theLBS query as shown in Figure 4 Pseudonym anonymizationfailed to ensure the required degree of anonymity [17ndash19]because each user has a limited number of restricted areaswhich are known by the attacker and allow him to createa predetermined victim behavior These restricted areas areplaces visited regularly by the victim such as home officeand home-office road In other words the anonymizingtechnique is vulnerable to correlation attacks Although theusername is anonymized other remaining attributes calledquasi-identifiers can still be mapped to individuals (eg agesex and city [16 20 21])

4 Mobile Information Systems

Access point

Internet

Anonymity server

LBSLocationrequestanswer

Cloaked region

Figure 5 Anonymization using119870-anonymity

The scenario behind Figure 4 is that user A currentlyfound at location 01001 is interested in finding a certainservice which is returned by the LBS server as located at 01110

119870-anonymity is another proposedmethod to ensure loca-tion privacy where a location-dependent query is consideredprivate if the attacker is able to identify the requester withprobability less than 1119870 [22] 119870 is a threshold required bythe user [18]Thismethod uses an intermediate trusted servercalled anonymizer as shown in Figure 5

Every subscribed user has to register with a trustedanonymity server (anonymizer) by updating its identity andlocation This anonymizer maintains an updated databaseof userrsquos current location Each triggered query passes bythe anonymizer it replaces the userrsquos location by a cloakingregion (CR) and forwards the request to LBS server CRcontains in addition to the real user 119870 minus 1 users belongingto its neighborhood The attacker knows that one withinthis vicinity has requested this query but the probability ofidentifying the right user equals 1119870 The LBS server replieswith a list containing the identity of each user from CR withits corresponding point of interest (POI) (ie queryrsquos answer)The anonymizer then filters the POI corresponding to the realuser and passes it to the requester [17 18] 119870-anonymity alsohas drawbacks [17 18]

(i) The anonymizer is considered a single point of failureand bottleneck thus the probability of full outage ishigher than that in pseudonym anonymization

(ii) Malicious users can be physically located near thevictim thus the anonymizer will add these users inits CR Since the eavesdropper knows the malicioususers it can predict the userrsquos identity with probabilitygt 1119870

(iii) 119870-anonymity is also vulnerable to correlation attacks(iv) 119870-anonymityrsquos security level is directly proportional

to the frequency of usersrsquo location update It is notpractical and scalable to request location updatesperiodically from all users

(v) It is more profitable for idle users not to updatetheir location since moving from LTE IDLE toLTE ACTIVE to send this update message will causehigher battery consumption and excess signaling oncore level

(vi) Generated core traffic is 119870 minus 1 times more than whatis needed

(vii) Only identity and identity-query privacy are ensuredbut not location privacy

(viii) It is difficult to support continuous LBS

Crowdsourcing applications differ from legacy LBSs inthe resolution mechanism since one or more employeesrecruited from the crowd will participate in answering thequery but userrsquos interface and query mechanism (where theprivacy mechanisms lay) are slightly changed This makesthe privacy requirements and location privacy preservingmechanism that are used in legacy LBS remain valid incrowdsourcing applications

The privacy requirements applicable for legacy andcrowdsourced LBS are surveyed in the next subsection

22 Privacy Requirements of Crowdsourced LBS ApplicationsEvery crowdsourced LBS should maintain a set of privacyrequirements in order to be considered privacy preservingThese requirements are used to prevent the disclosure of sen-sitive information that could be used as part of more complexattacks The attacks range from tracking the user to profilinghis attitudes and interests LBSrsquos privacy requirements are asfollows

(i) Location privacy in crowdsourced LBS location-identity relationship is relatively easy to be exposedusing correlation attacks [14]

(ii) Identity privacy exposing a userrsquos nonrepudiatedidentity can expose him to location tracking byexploiting the well-known Evolved Packet System(EPS)Universal Mobile Telecommunication System(UMTS) tracking vulnerabilities [23 24]

(iii) Identity-query privacy LBS requests have no system-wide effect (ie wrong information will affect theuserrsquos request only without having side effects onother users) Identity traceback is not required

Location privacy preserving mechanisms (LPPMs) usedby LBS applications are surveyed in the next subsection

23 Location Privacy Preserving Mechanisms In this subsec-tion we survey the LPPMs that were proposed for legacy LBS[14] which are as follows

(i) Anonymization and obfuscation anonymization isreplacing the username part of the LBS query with

Mobile Information Systems 5

a pseudonym obfuscation is responsible for distort-ing the LBS queryrsquos second part (location) Variouspseudonym generation mechanisms are found suchas Mix zones [14]

(ii) Private Information Retrieval (PIR) both request andreply are encrypted leaving the server unable toidentify the sender or the request This mechanism issuitable for legacy LBS but impractical in crowdsourc-ing since the search engines need to send the crowda plaintext job

(iii) Feeling-based approach this is achieved by identi-fying a public region and sending a request withdisclosed location that must be at least as popular asthat space

24 Attacks on 119870-Anonymity In this subsection we surveythe attacks on 119870-anonymity To do so the following termsshould be defined first

(i) Quasi-identifiers a group of attributes that canuniquely identify an entity (eg age ZIP code)

(ii) Sensitive attributes information which results inprivacy breaching if matched to an entity

The term ldquocloaking region (CR)rdquo used in Section 2 isidentical to the term ldquoclassrdquo used by the authors of [25 26]who defined it as ldquoa set of records that are indistinguishablefrom each other with respect to certain identifying attributesrdquo[25]Wewill only use CR in this paper tomaintain coherence

Several attacks on 119870-anonymity have been discussed inthe literature which are as follows

(i) Selfish behavior based attack the authors of [27]showed the behavior of selfish users in ldquoanonymiza-tion and obfuscationrdquo LPPM and its effect on privacyprotection In the areas where the number of LBSusers is less than ldquo119896rdquo a requester can generate dummyusers so that its CR can again contain ldquo119896rdquo users(real and dummy) in order to achieve 119870-anonymityOther selfish users (free riders) can benefit from thegenerated dummy users to obfuscate their requestsThe cost of generating dummy users is not distributedfairly among the users benefiting from this LPPM

(ii) Homogeneity attack [25 26] this attack takes advan-tage of the fact that ldquo119870-anonymity creates groupsthat leak information due to lack of diversity in thesensitive attributerdquo [25] The attack expects that thesearched attribute (sensitive attribute) is the same forall the users within the sameCRAn example showingthis attack is implemented as follows consider themedical records of a certain population shown inTable 1 and its anonymized version published inTable 2 [25] Alice tries to find the medical situationof Bob who is included in Table 1 Bob is a 31-year-old Americanmale who lives in ZIP code 13053 Alicecan deduce that Bob has cancer since all the membersbelonging to his group (CR = 9 10 11 12) have thesame disease

Table 1 Medical records of a sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 13053 28 Russian Heart disease2 13068 29 American Heart disease3 13068 21 Japanese Viral infection4 13053 23 American Viral infection5 14853 50 Indian Cancer6 14853 55 Russian Heart disease7 14850 47 American Viral infection8 14850 49 American Viral infection9 13053 31 American Cancer10 13053 37 Indian Cancer11 13068 36 Japanese Cancer12 13068 35 American Cancer

(iii) Background knowledge attack this attack takesadvantage of the fact that ldquo119870-anonymity does not pro-tect against attacks based on background knowledgerdquo[25] which are not included in the offered informa-tion An example showing this attack is implementedas follows Alice tries to know the medical situationof her friend Umeko who is admitted to the samehospital as Bob so his records are also shown inTable 1 Umeko is a 21-year-old Japanese female whocurrently lives in ZIP code 13068 thus belonging tothe first CR = 1 2 3 4 Alice can deduce thatUmeko has either viral infection or heart disease It iswell known that the Japanese have an extremely lowincidence of heart disease thus Alice is nearly surethat Umeko has a viral infection

As shown previously 119870-anonymity has various weak-nesses and it would be interesting if we are able to replaceit Advanced privacy notions have been proposed to replace119870-anonymity for microdata publishing It looks tempting toadopt these notions into LBS In the coming section we aregoing to show that none of the advanced privacy notionsfound in the literature except for119870-anonymity is suitable forcrowdsourced location-based services

3 119870-Anonymity 119897-Diversity and 119905-Closeness

119870-anonymity is considered not enough to ensure the privacyneeded for microdata publishing Microdata are tables thatcontain unaggregated information which include medicalvoter registration census and customer data They areusually used as a valuable source of information for theallocation of public funds medical research and trendanalysis Table 1 is a sample microdata table while Table 2is an anonymized microdata table Other stronger privacynotions were proposed in the literature which are 119897-diversity[25] and 119905-closeness [26 28]

The ldquo119897-diversityrdquo privacy notion states that a CR is consid-ered to have 119897-diversity if there are at least 119897 ldquowell-representedrdquovalues [25] ldquoWell-representedrdquo can be interpreted as follows

6 Mobile Information Systems

Table 2 Anonymized records of the sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 130lowastlowast lt30 lowast Heart disease2 130lowastlowast lt30 lowast Heart disease3 130lowastlowast lt30 lowast Viral infection4 130lowastlowast lt30 lowast Viral infection5 1485lowast gt40 lowast Cancer6 1485lowast gt40 lowast Heart disease7 1485lowast gt40 lowast Viral infection8 1485lowast gt40 lowast Viral infection9 130lowastlowast 3lowast lowast Cancer10 130lowastlowast 3lowast lowast Cancer11 130lowastlowast 3lowast lowast Cancer12 130lowastlowast 3lowast lowast Cancerlowast in the ZIP code column refers to missing data lowastlowastmeans that 2 digits areanonymized lowast in the Nationality column is used to identify an anonymizednationality

(i) Distinct 119897-diversity it contains at least 119897 distinctvalues

(ii) Entropy 119897-diversity entropy of the values thatoccurred is greater than or equal to log 119897

(iii) Recursive 119897-diversity the most frequent value doesnot appear toomuch and the less frequent value doesnot appear too rarely

A table is considered to have 119897-diversity if all its CRs have119897-diversity Various attacks have been proposed against thisprivacy notion which results in privacy disclosure [26]Theseattacks are as follows

(i) Skewness attack [26] skewed distributions couldsatisfy 119897-diversity but still do not prevent attributedisclosure This happens when the frequency ofoccurrence of a certain sensitive attribute inside aCR differs from its frequency in the whole table(population) Consider the drug addiction rate in acertain population to be 1 One of the classes hasan addiction rate of 50 This class is consideredattribute disclosing since a member of this class isidentified as addict with a high probability

(ii) Similarity attack [26] distinct but semantically simi-lar sensitive values could satisfy 119897-diversity but meanthe same for the attacker If the sensitive informationshown in a table contains cocaine addict heroinaddict marijuana addict and so forth as distinctentries it could be considered satisfying 119897-diversitybut for a drug dealer all these entries are consideredas a target

The ldquo119905-closenessrdquo concept is another privacy notion thatdifferentiates between the information gained about thewhole population and that about specific individuals [26]As the first gain is tolerated and motivated the second isconsidered privacy breaching A class is considered to have

Table 3 Data stored at the LBS anonymizer

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1No

Location 2 NoLocation 3 Yes

CR2Location 4

Query 2No

Location 5 YesLocation 6 No

Table 4 Data captured by the adversary

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1lowast

Location 2 lowast

Location 3 lowast

CR2Location 4

Query 2lowast

Location 5 lowast

Location 6 lowast

lowast refers to an anonymized Boolean value (true or false)

ldquo119905-closenessrdquo if the distance between the distribution of asensitive attribute in the studied class and the distributionin the whole table (population) is less than or equal to athreshold 119905 The distance is measured using Earth MoverrsquosDistance (EMD)

It is shown in [26] that ldquo119905-closenessrdquo ensures higherprivacy than ldquo119897-diversityrdquo

31 119870-Anonymity versus 119897-Diversity and 119905-Closeness Thedifference in nature between microdata tables and LBS CRcaptured data is the main factor that prevents the implemen-tation of ldquo119897-diversityrdquo and ldquo119905-closenessrdquo as privacy metrics inLPPMs

Anonymizing microdata keeps the sensitive attributesintact and hides parts of the quasi-identifiers to preventuniquely identifying an entity An entity can only be iden-tified as an unknown record in a class ldquo119897-diversityrdquo and ldquo119905-closenessrdquo try to keep the relationship between the entities ofa class and its sensitive attributes as private as possible usingthe techniques discussed previously

LBS anonymization tries on the contrary to hide thesensitive data instead of generalizing the quasi-identifierswhich are needed by the location-based server (locationexpertise and query) to be able to respond to the userrsquos queryTable 3 represents the LBS data stored in the anonymizer

The anonymized data that are transmitted to the LBSserver and could be captured by the adversary are shown inTable 4

As can be seen in the microdata anonymization in Tables1 and 2 and LBS anonymization in Tables 3 and 4 the latterfocuses on hiding the sensitive data which totally negates theaim behind publishing microdata

Mobile Information Systems 7

Based on the above discussion we can deduce that 119897-diversity and 119905-closeness are not suitable for LBS anonymiza-tion thus 119870-anonymity remains the best suitable privacynotion suitable for such services

We have shown in the previous sections that LBS appli-cations are in need of privacy preserving mechanisms basedon valid privacy notions We have also shown that thealready implemented privacy notion119870-anonymity could beexploited using various attacks but stricter privacy notionsare not suitable We can conclude that LPPMs using 119870-anonymity are the best available solutions To prove the needfor a novel privacy concept we will present in the comingsections a newattack on119870-anonymity thatwehave developed[29]

4 Mathematical Model

We will prove in this section that 119870-anonymity is private ifldquozero prior knowledgerdquo is assumed and the adversary couldnot perform the attack for a long continuous period

Let 1199091 1199092 119909

119899 be the set of users found in a CR

where ldquo119899rdquo is its size and 119899 ge 1 Let ldquo119909rdquo be our investigateduser If 119909 notin 119909

1 1199092 119909

119899 that is 119909 notin CR then 119909 is for sure

not the requestor The probability of a user being added to aCR without being the true requestor is

119875 (119909 isin CR (119909119894= requestor amp 119909 = 119909

119894))

=119875 (119909 isin CR cap 119909

119894= requestor)

119875 (119909119894= requestor)

=119899 minus 1

119873 minus 1

(1)

where119873 is the number of users in the area controlled by thestudied anonymizer where 1 le 119899 le 119873 The probability of auser being added to a CR while being the true requestor is

119875 (119909 isin CR (119909 = requestor)) = 1 (2)

We can deduce that the probability of a user being addedto a CR is119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

(3)

If all the users have the same frequency of sendingrequests (ie 119909

119894are equiprobable ie 119875(119909

119894= requestor) =

1119873) we can deduce that119875 (119909 = requestor119909 isin CR)

=119875 (119909 = requestor119909 isin CR)

119875 (119909 isin CR)=

(1119873)

(119899119873)=

1

119899

(4)

Based on the assumption that all users are equiprobable(since the adversary has ldquozero prior knowledgerdquo and could notperform the attack for a long continuous period he cannotassume otherwise) 119870-anonymity is satisfied if the cloakingregion size is greater than119870 that is 119899 ge 119870 We will prove inthe coming section that the privacy offered by119870-anonymitycould be breached using our proposed attack (frequencyattack) if the adversary is able to perform the attack for a longperiod even if ldquozero prior knowledgerdquo is assumed

5 Frequency Attack

Until now we have considered that all users are equiprobablebut in reality they are not Consider an LBS which can beused by a user to retrieve the location of the nearest shop(bakery Chinese restaurant etc) While being home I amaware of the most important places which I usually accessand I know the bakeries barbers supermarkets and so forthwithin my town so I need no support in locating these placeswhile on the contrary I can help othersWhen visiting places Iamnew to it will be hard forme to find certain targets (metrosupermarket etc) especially when facing language problemsMy rate of asking for guidance at these new places is muchhigher than that while being atmy known region Let comfortzone be a set of areas where a certain user is familiar with andneeds no support from the studied LBS in finding places Auserrsquos comfort zone includes his hometown and workplaceOutside this comfort zone the query frequency of this userincreases

Based on the above discussion we can easily assumethat the usersrsquo frequencies are not equal that is exist119894119875(119909

119894=

requestor) = 1119873If a user ldquo119894rdquo is to be found in a CR then either he is the real

requestor or he is not and selected from the ldquo119873minus1rdquo remainingusers

If user ldquo119894rdquo is the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = 1 since the real requestor should be inthe CR else he will not receive a valid answer

If user ldquo119894rdquo is not the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = (119899minus1)(119873minus1) that is one of the ldquo119899minus1rdquoselected users from the ldquo119873 minus 1rdquo remaining users

Let 119875119894= 119875(119909

119894= requestor) then

119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

=

119873

sum

119894=1

119894 =119909

(119899 minus 1

119873 minus 1)119875119894+ (1) 119875

119909= (

119899 minus 1

119873 minus 1)

119873

sum

119894=1

119894 =119909

119875119894+ 119875119909

Since119873

sum

119894=1

119875119894= 1 then

119873

sum

119894=1

119894 =119909

119875119894= 1 minus 119875

119909

(5)

then

119875 (119909 isin CR) = (119899 minus 1

119873 minus 1) (1 minus 119875

119909) + 119875119909

=119899 minus 1

119873 minus 1+ (

119873 minus 119899

119873 minus 1)119875119909

(6)

let

119886 = (119873 minus 119899

119873 minus 1)

119887 = (119899 minus 1

119873 minus 1)

(7)

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 2: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

2 Mobile Information Systems

Crowd

RequesterRequest

Reply

Crowdsourcingserver

Contributor

ContributorContributor

Contributor

Contributor

Figure 1 Crowdsourcing scenario

engines [5ndash7] Chacha [8] is a popular crowdsourcingapplication

(ii) ldquoCrowdsourced location-based servicerdquo [4] is amethodto fetch the recommendations about certain location-based categories posted by people with taste andinterest similar to the requester Foursquare [9] is apopular application offering crowdsourced location-based service (LBS)

In this paper we will be focusing on ldquocrowdsourcedlocation-based servicerdquo (crowdsourced LBS) but our workcould be extended to other types of applications Chorus[10] Foursquare [9 11] and CrowdSearcher [2] are threeexamples of crowdsourced LBS applications which requireuserrsquos location to be transmitted with the request in order toensure optimized task routing

Figure 2 shows a sample request sent to CrowdSearcherand the steps followed in the resolution mechanism Therequest is as follows find a good jobwith a suitable apartmentclose to a good drug rehabilitation center within my neigh-borhood (ie near my current location) The steps taken torespond to the above query are as follows

(1) The resolution mechanism searches for jobs suitablefor the requestorrsquos profile and houses offered for rentboth located in the requestorrsquos neighborhood

(2) The retrieved jobs and their offered salaries arecompared based on the crowdrsquos opinion The selectedcrowd has to have expertise in the requesterrsquos jobdomain and geographic location

(3) The drug rehabilitation centers close to the retrievedhouses are compared based on the crowdrsquos opinionThe second crowd has to have expertise in healthor social companionship and the studied geographiclocation

Resolution mechanism

Selectjob

Selecthouse

for rent

Ask foropinions

on job

Search fordrugrehabcenter

Ask forsuggestions

on rehabcenters

Query jobposition

features (role

Reply

Que

rym

echa

nism

Request

Reply

location etc)

Figure 2 Task division in crowdsourcing location-based services

RequesterQuery

Answer

Crowdsourcing serverNeighbor attacker

Insider attacker

Figure 3 Attacker model

(4) The ldquohouse-job-rehabilitation centerrdquo result havingthe highest crowd rate is sent back to the requestor

The square blocks in Figure 2 represent machine-basedtasks while the hexagon blocks represent crowd-based ones

As shown in the above scenario and Figure 2 Crowd-Searcher requires userrsquos location as part of the request in orderto filter out the crowd not belonging to the area the user isinterested in (since geographic location is one of the neededexpertise areas) The contributors (crowd) are not aware ofthe full query but respond only to small portions (eg Howdo you rate this rehab center) (last hexagon in Figure 2)

As userrsquos location is transmitted an attacker who islocated just before (or within) the crowdsourcing servercan capture the full query containing the requesterrsquos currentlocation as shown in Figure 3The attackermodel is discussedin Section 7

It is easy to relate userrsquos identity to the transmittedlocation and in turn to the request This identity-query dis-closure reveals private information about the userrsquos interestsaffiliations and future plans and possibly helps in trackinghim In our case the attacker can easily identify the requesterand deduce that he or somebody in his small family isa drug addict The attacker can sell this information tolocal drug dealers who can target the addict after leavingthe rehabilitation center The relapse rate (return back toaddiction after rehabilitation) is very high (40ndash60) [12 13]which makes the postaddict an easy prey

Crowdsourced LBS natively compromises the privacy ofall its users because it uses the same security mechanismsutilized by legacy LBS which contains well-known privacy-breaching vulnerabilities Research efforts were exerted to

Mobile Information Systems 3

LBSUser eNB

User identity ALocation 01001

Pseudonym ZDRFUser identity ZDRF

Location 01001

User identity ZDRFLocation 01001

User identity ZDRFRequested location

01110

User identity ZDRFRequested location 01110

Figure 4 Anonymization using pseudonym

ensure the privacy of legacy LBS users but all failed to preventidentity-query disclosure To the best of our knowledge nonehas taken advantage of crowdsourced LBSrsquos mobile Cloudnature to offer enhanced privacy

Matching userrsquos location to his queries has severe securityconsequences since location can be related to identity [14]and queries can be related to user preferences interests ideasand beliefs Being able to match userrsquos identity to preferencecan be used for customer profiling and behavior expectancyTyrant governments would be interested in matching queriesand posts countering their regimes to the identity andlocation of the activists Safeguarding this match is crucial tomaintain user privacy and sometimes user safety

We are going to show in this paper that all the advancedprivacy notions found in the literature used for anonymiza-tion are not suitable for LBS We are going to show also thatthe suitable privacy notions have proven vulnerabilities Wecan then deduce that location privacy in crowdsourced LBS isnotmet and could be easily breached thus the need for a newprivacy model is inevitable Finally we are going to propose anew privacymodel for crowdsourced LBS based on itsmobileCloud nature

The remainder of this paper is organized as followsSection 2 presents background information to make thispaper self-contained Section 3 surveys the latest and mostadvanced privacy notions found in the literature and provesthat none is suitable for crowdsourced LBS except 119870-anonymity Sections 4 5 and 6 discuss the ldquofrequency attackrdquowe developed against 119870-anonymity Section 4 shows themathematical model behind 119870-anonymity constraint whichis used to evaluate the privacy level of different locationprivacy preservingmechanisms (LPPMs) Section 5 proposesldquofrequency attackrdquo which is used to exploit the most secureLPPM ldquofootprintsrdquo Section 6 shows the simulation results ofldquofrequency attackrdquo to evaluate the privacy breach level

A new solution for this privacy issue is proposed in Sec-tion 7 by changing the computational environment FinallySection 8 concludes the paper

2 Related Work

In this section we present legacy LBS which is the prereq-uisite to understand crowdsourced LBS architecture and theused privacy preserving mechanisms We also present theused anonymization and privacy notion (119870-anonymity) Wethen survey the privacy requirements that should be main-tained by location privacy preserving mechanisms which arealso surveyed At the end of this section we survey the attackson119870-anonymity to show its vulnerabilities

21 Legacy Location-Based Services and 119870-AnonymityLocation-based service is a computer-level online servicethat utilizes the userrsquos current position as a critical inputfor the application providing this service [15] Locationcoordinates can be delivered through GPS equipped mobiledevices [16] or through the mobile userrsquos operator If amultilateration positioning technique is used in the servingnetwork then the exact location of the user can be specifiedIf multilateration is not used then only the distance to theserving eNB (evolved Node B) is delivered in addition tothe eNBrsquos coordinates in other words the user knows that itbelongs to the circumference of a circle centered at servingeNB its radius is the userrsquos distance to the center Note that

(i) location-dependent query is a user triggered requestto a location-based service

(ii) nearest neighbor query is a location-dependent queryrequesting the address of the nearest point of interest

Before being able to request any information fromlocation-based services the mobile user has to update hislocation by sending the coordinates to the LBS server whichin turn replies with the requested information Security of thelocation-related information transmitted over the air channelis considered to be outside the scope of this paper due to theimplemented confidentiality and integrity protection at theAccess Stratum (AS) layer We will only consider last mileeavesdropping (between PDN Gateway (P-GW) and LBSserver) carried out by outside attackers (neighbor attacks)or the service providers themselves (insider attacks) Captur-ing insecure identities and location information allows theattacker to breach the userrsquos privacy by being able to knowif the user is in a certain area and where precisely he is

Anonymization was proposed using pseudonym identi-ties as a cost-effective way to ensure identity privacy Identityanonymization is implemented at the mobile level wherea pseudonym is generated to replace the username in theLBS query as shown in Figure 4 Pseudonym anonymizationfailed to ensure the required degree of anonymity [17ndash19]because each user has a limited number of restricted areaswhich are known by the attacker and allow him to createa predetermined victim behavior These restricted areas areplaces visited regularly by the victim such as home officeand home-office road In other words the anonymizingtechnique is vulnerable to correlation attacks Although theusername is anonymized other remaining attributes calledquasi-identifiers can still be mapped to individuals (eg agesex and city [16 20 21])

4 Mobile Information Systems

Access point

Internet

Anonymity server

LBSLocationrequestanswer

Cloaked region

Figure 5 Anonymization using119870-anonymity

The scenario behind Figure 4 is that user A currentlyfound at location 01001 is interested in finding a certainservice which is returned by the LBS server as located at 01110

119870-anonymity is another proposedmethod to ensure loca-tion privacy where a location-dependent query is consideredprivate if the attacker is able to identify the requester withprobability less than 1119870 [22] 119870 is a threshold required bythe user [18]Thismethod uses an intermediate trusted servercalled anonymizer as shown in Figure 5

Every subscribed user has to register with a trustedanonymity server (anonymizer) by updating its identity andlocation This anonymizer maintains an updated databaseof userrsquos current location Each triggered query passes bythe anonymizer it replaces the userrsquos location by a cloakingregion (CR) and forwards the request to LBS server CRcontains in addition to the real user 119870 minus 1 users belongingto its neighborhood The attacker knows that one withinthis vicinity has requested this query but the probability ofidentifying the right user equals 1119870 The LBS server replieswith a list containing the identity of each user from CR withits corresponding point of interest (POI) (ie queryrsquos answer)The anonymizer then filters the POI corresponding to the realuser and passes it to the requester [17 18] 119870-anonymity alsohas drawbacks [17 18]

(i) The anonymizer is considered a single point of failureand bottleneck thus the probability of full outage ishigher than that in pseudonym anonymization

(ii) Malicious users can be physically located near thevictim thus the anonymizer will add these users inits CR Since the eavesdropper knows the malicioususers it can predict the userrsquos identity with probabilitygt 1119870

(iii) 119870-anonymity is also vulnerable to correlation attacks(iv) 119870-anonymityrsquos security level is directly proportional

to the frequency of usersrsquo location update It is notpractical and scalable to request location updatesperiodically from all users

(v) It is more profitable for idle users not to updatetheir location since moving from LTE IDLE toLTE ACTIVE to send this update message will causehigher battery consumption and excess signaling oncore level

(vi) Generated core traffic is 119870 minus 1 times more than whatis needed

(vii) Only identity and identity-query privacy are ensuredbut not location privacy

(viii) It is difficult to support continuous LBS

Crowdsourcing applications differ from legacy LBSs inthe resolution mechanism since one or more employeesrecruited from the crowd will participate in answering thequery but userrsquos interface and query mechanism (where theprivacy mechanisms lay) are slightly changed This makesthe privacy requirements and location privacy preservingmechanism that are used in legacy LBS remain valid incrowdsourcing applications

The privacy requirements applicable for legacy andcrowdsourced LBS are surveyed in the next subsection

22 Privacy Requirements of Crowdsourced LBS ApplicationsEvery crowdsourced LBS should maintain a set of privacyrequirements in order to be considered privacy preservingThese requirements are used to prevent the disclosure of sen-sitive information that could be used as part of more complexattacks The attacks range from tracking the user to profilinghis attitudes and interests LBSrsquos privacy requirements are asfollows

(i) Location privacy in crowdsourced LBS location-identity relationship is relatively easy to be exposedusing correlation attacks [14]

(ii) Identity privacy exposing a userrsquos nonrepudiatedidentity can expose him to location tracking byexploiting the well-known Evolved Packet System(EPS)Universal Mobile Telecommunication System(UMTS) tracking vulnerabilities [23 24]

(iii) Identity-query privacy LBS requests have no system-wide effect (ie wrong information will affect theuserrsquos request only without having side effects onother users) Identity traceback is not required

Location privacy preserving mechanisms (LPPMs) usedby LBS applications are surveyed in the next subsection

23 Location Privacy Preserving Mechanisms In this subsec-tion we survey the LPPMs that were proposed for legacy LBS[14] which are as follows

(i) Anonymization and obfuscation anonymization isreplacing the username part of the LBS query with

Mobile Information Systems 5

a pseudonym obfuscation is responsible for distort-ing the LBS queryrsquos second part (location) Variouspseudonym generation mechanisms are found suchas Mix zones [14]

(ii) Private Information Retrieval (PIR) both request andreply are encrypted leaving the server unable toidentify the sender or the request This mechanism issuitable for legacy LBS but impractical in crowdsourc-ing since the search engines need to send the crowda plaintext job

(iii) Feeling-based approach this is achieved by identi-fying a public region and sending a request withdisclosed location that must be at least as popular asthat space

24 Attacks on 119870-Anonymity In this subsection we surveythe attacks on 119870-anonymity To do so the following termsshould be defined first

(i) Quasi-identifiers a group of attributes that canuniquely identify an entity (eg age ZIP code)

(ii) Sensitive attributes information which results inprivacy breaching if matched to an entity

The term ldquocloaking region (CR)rdquo used in Section 2 isidentical to the term ldquoclassrdquo used by the authors of [25 26]who defined it as ldquoa set of records that are indistinguishablefrom each other with respect to certain identifying attributesrdquo[25]Wewill only use CR in this paper tomaintain coherence

Several attacks on 119870-anonymity have been discussed inthe literature which are as follows

(i) Selfish behavior based attack the authors of [27]showed the behavior of selfish users in ldquoanonymiza-tion and obfuscationrdquo LPPM and its effect on privacyprotection In the areas where the number of LBSusers is less than ldquo119896rdquo a requester can generate dummyusers so that its CR can again contain ldquo119896rdquo users(real and dummy) in order to achieve 119870-anonymityOther selfish users (free riders) can benefit from thegenerated dummy users to obfuscate their requestsThe cost of generating dummy users is not distributedfairly among the users benefiting from this LPPM

(ii) Homogeneity attack [25 26] this attack takes advan-tage of the fact that ldquo119870-anonymity creates groupsthat leak information due to lack of diversity in thesensitive attributerdquo [25] The attack expects that thesearched attribute (sensitive attribute) is the same forall the users within the sameCRAn example showingthis attack is implemented as follows consider themedical records of a certain population shown inTable 1 and its anonymized version published inTable 2 [25] Alice tries to find the medical situationof Bob who is included in Table 1 Bob is a 31-year-old Americanmale who lives in ZIP code 13053 Alicecan deduce that Bob has cancer since all the membersbelonging to his group (CR = 9 10 11 12) have thesame disease

Table 1 Medical records of a sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 13053 28 Russian Heart disease2 13068 29 American Heart disease3 13068 21 Japanese Viral infection4 13053 23 American Viral infection5 14853 50 Indian Cancer6 14853 55 Russian Heart disease7 14850 47 American Viral infection8 14850 49 American Viral infection9 13053 31 American Cancer10 13053 37 Indian Cancer11 13068 36 Japanese Cancer12 13068 35 American Cancer

(iii) Background knowledge attack this attack takesadvantage of the fact that ldquo119870-anonymity does not pro-tect against attacks based on background knowledgerdquo[25] which are not included in the offered informa-tion An example showing this attack is implementedas follows Alice tries to know the medical situationof her friend Umeko who is admitted to the samehospital as Bob so his records are also shown inTable 1 Umeko is a 21-year-old Japanese female whocurrently lives in ZIP code 13068 thus belonging tothe first CR = 1 2 3 4 Alice can deduce thatUmeko has either viral infection or heart disease It iswell known that the Japanese have an extremely lowincidence of heart disease thus Alice is nearly surethat Umeko has a viral infection

As shown previously 119870-anonymity has various weak-nesses and it would be interesting if we are able to replaceit Advanced privacy notions have been proposed to replace119870-anonymity for microdata publishing It looks tempting toadopt these notions into LBS In the coming section we aregoing to show that none of the advanced privacy notionsfound in the literature except for119870-anonymity is suitable forcrowdsourced location-based services

3 119870-Anonymity 119897-Diversity and 119905-Closeness

119870-anonymity is considered not enough to ensure the privacyneeded for microdata publishing Microdata are tables thatcontain unaggregated information which include medicalvoter registration census and customer data They areusually used as a valuable source of information for theallocation of public funds medical research and trendanalysis Table 1 is a sample microdata table while Table 2is an anonymized microdata table Other stronger privacynotions were proposed in the literature which are 119897-diversity[25] and 119905-closeness [26 28]

The ldquo119897-diversityrdquo privacy notion states that a CR is consid-ered to have 119897-diversity if there are at least 119897 ldquowell-representedrdquovalues [25] ldquoWell-representedrdquo can be interpreted as follows

6 Mobile Information Systems

Table 2 Anonymized records of the sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 130lowastlowast lt30 lowast Heart disease2 130lowastlowast lt30 lowast Heart disease3 130lowastlowast lt30 lowast Viral infection4 130lowastlowast lt30 lowast Viral infection5 1485lowast gt40 lowast Cancer6 1485lowast gt40 lowast Heart disease7 1485lowast gt40 lowast Viral infection8 1485lowast gt40 lowast Viral infection9 130lowastlowast 3lowast lowast Cancer10 130lowastlowast 3lowast lowast Cancer11 130lowastlowast 3lowast lowast Cancer12 130lowastlowast 3lowast lowast Cancerlowast in the ZIP code column refers to missing data lowastlowastmeans that 2 digits areanonymized lowast in the Nationality column is used to identify an anonymizednationality

(i) Distinct 119897-diversity it contains at least 119897 distinctvalues

(ii) Entropy 119897-diversity entropy of the values thatoccurred is greater than or equal to log 119897

(iii) Recursive 119897-diversity the most frequent value doesnot appear toomuch and the less frequent value doesnot appear too rarely

A table is considered to have 119897-diversity if all its CRs have119897-diversity Various attacks have been proposed against thisprivacy notion which results in privacy disclosure [26]Theseattacks are as follows

(i) Skewness attack [26] skewed distributions couldsatisfy 119897-diversity but still do not prevent attributedisclosure This happens when the frequency ofoccurrence of a certain sensitive attribute inside aCR differs from its frequency in the whole table(population) Consider the drug addiction rate in acertain population to be 1 One of the classes hasan addiction rate of 50 This class is consideredattribute disclosing since a member of this class isidentified as addict with a high probability

(ii) Similarity attack [26] distinct but semantically simi-lar sensitive values could satisfy 119897-diversity but meanthe same for the attacker If the sensitive informationshown in a table contains cocaine addict heroinaddict marijuana addict and so forth as distinctentries it could be considered satisfying 119897-diversitybut for a drug dealer all these entries are consideredas a target

The ldquo119905-closenessrdquo concept is another privacy notion thatdifferentiates between the information gained about thewhole population and that about specific individuals [26]As the first gain is tolerated and motivated the second isconsidered privacy breaching A class is considered to have

Table 3 Data stored at the LBS anonymizer

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1No

Location 2 NoLocation 3 Yes

CR2Location 4

Query 2No

Location 5 YesLocation 6 No

Table 4 Data captured by the adversary

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1lowast

Location 2 lowast

Location 3 lowast

CR2Location 4

Query 2lowast

Location 5 lowast

Location 6 lowast

lowast refers to an anonymized Boolean value (true or false)

ldquo119905-closenessrdquo if the distance between the distribution of asensitive attribute in the studied class and the distributionin the whole table (population) is less than or equal to athreshold 119905 The distance is measured using Earth MoverrsquosDistance (EMD)

It is shown in [26] that ldquo119905-closenessrdquo ensures higherprivacy than ldquo119897-diversityrdquo

31 119870-Anonymity versus 119897-Diversity and 119905-Closeness Thedifference in nature between microdata tables and LBS CRcaptured data is the main factor that prevents the implemen-tation of ldquo119897-diversityrdquo and ldquo119905-closenessrdquo as privacy metrics inLPPMs

Anonymizing microdata keeps the sensitive attributesintact and hides parts of the quasi-identifiers to preventuniquely identifying an entity An entity can only be iden-tified as an unknown record in a class ldquo119897-diversityrdquo and ldquo119905-closenessrdquo try to keep the relationship between the entities ofa class and its sensitive attributes as private as possible usingthe techniques discussed previously

LBS anonymization tries on the contrary to hide thesensitive data instead of generalizing the quasi-identifierswhich are needed by the location-based server (locationexpertise and query) to be able to respond to the userrsquos queryTable 3 represents the LBS data stored in the anonymizer

The anonymized data that are transmitted to the LBSserver and could be captured by the adversary are shown inTable 4

As can be seen in the microdata anonymization in Tables1 and 2 and LBS anonymization in Tables 3 and 4 the latterfocuses on hiding the sensitive data which totally negates theaim behind publishing microdata

Mobile Information Systems 7

Based on the above discussion we can deduce that 119897-diversity and 119905-closeness are not suitable for LBS anonymiza-tion thus 119870-anonymity remains the best suitable privacynotion suitable for such services

We have shown in the previous sections that LBS appli-cations are in need of privacy preserving mechanisms basedon valid privacy notions We have also shown that thealready implemented privacy notion119870-anonymity could beexploited using various attacks but stricter privacy notionsare not suitable We can conclude that LPPMs using 119870-anonymity are the best available solutions To prove the needfor a novel privacy concept we will present in the comingsections a newattack on119870-anonymity thatwehave developed[29]

4 Mathematical Model

We will prove in this section that 119870-anonymity is private ifldquozero prior knowledgerdquo is assumed and the adversary couldnot perform the attack for a long continuous period

Let 1199091 1199092 119909

119899 be the set of users found in a CR

where ldquo119899rdquo is its size and 119899 ge 1 Let ldquo119909rdquo be our investigateduser If 119909 notin 119909

1 1199092 119909

119899 that is 119909 notin CR then 119909 is for sure

not the requestor The probability of a user being added to aCR without being the true requestor is

119875 (119909 isin CR (119909119894= requestor amp 119909 = 119909

119894))

=119875 (119909 isin CR cap 119909

119894= requestor)

119875 (119909119894= requestor)

=119899 minus 1

119873 minus 1

(1)

where119873 is the number of users in the area controlled by thestudied anonymizer where 1 le 119899 le 119873 The probability of auser being added to a CR while being the true requestor is

119875 (119909 isin CR (119909 = requestor)) = 1 (2)

We can deduce that the probability of a user being addedto a CR is119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

(3)

If all the users have the same frequency of sendingrequests (ie 119909

119894are equiprobable ie 119875(119909

119894= requestor) =

1119873) we can deduce that119875 (119909 = requestor119909 isin CR)

=119875 (119909 = requestor119909 isin CR)

119875 (119909 isin CR)=

(1119873)

(119899119873)=

1

119899

(4)

Based on the assumption that all users are equiprobable(since the adversary has ldquozero prior knowledgerdquo and could notperform the attack for a long continuous period he cannotassume otherwise) 119870-anonymity is satisfied if the cloakingregion size is greater than119870 that is 119899 ge 119870 We will prove inthe coming section that the privacy offered by119870-anonymitycould be breached using our proposed attack (frequencyattack) if the adversary is able to perform the attack for a longperiod even if ldquozero prior knowledgerdquo is assumed

5 Frequency Attack

Until now we have considered that all users are equiprobablebut in reality they are not Consider an LBS which can beused by a user to retrieve the location of the nearest shop(bakery Chinese restaurant etc) While being home I amaware of the most important places which I usually accessand I know the bakeries barbers supermarkets and so forthwithin my town so I need no support in locating these placeswhile on the contrary I can help othersWhen visiting places Iamnew to it will be hard forme to find certain targets (metrosupermarket etc) especially when facing language problemsMy rate of asking for guidance at these new places is muchhigher than that while being atmy known region Let comfortzone be a set of areas where a certain user is familiar with andneeds no support from the studied LBS in finding places Auserrsquos comfort zone includes his hometown and workplaceOutside this comfort zone the query frequency of this userincreases

Based on the above discussion we can easily assumethat the usersrsquo frequencies are not equal that is exist119894119875(119909

119894=

requestor) = 1119873If a user ldquo119894rdquo is to be found in a CR then either he is the real

requestor or he is not and selected from the ldquo119873minus1rdquo remainingusers

If user ldquo119894rdquo is the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = 1 since the real requestor should be inthe CR else he will not receive a valid answer

If user ldquo119894rdquo is not the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = (119899minus1)(119873minus1) that is one of the ldquo119899minus1rdquoselected users from the ldquo119873 minus 1rdquo remaining users

Let 119875119894= 119875(119909

119894= requestor) then

119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

=

119873

sum

119894=1

119894 =119909

(119899 minus 1

119873 minus 1)119875119894+ (1) 119875

119909= (

119899 minus 1

119873 minus 1)

119873

sum

119894=1

119894 =119909

119875119894+ 119875119909

Since119873

sum

119894=1

119875119894= 1 then

119873

sum

119894=1

119894 =119909

119875119894= 1 minus 119875

119909

(5)

then

119875 (119909 isin CR) = (119899 minus 1

119873 minus 1) (1 minus 119875

119909) + 119875119909

=119899 minus 1

119873 minus 1+ (

119873 minus 119899

119873 minus 1)119875119909

(6)

let

119886 = (119873 minus 119899

119873 minus 1)

119887 = (119899 minus 1

119873 minus 1)

(7)

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 3: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Mobile Information Systems 3

LBSUser eNB

User identity ALocation 01001

Pseudonym ZDRFUser identity ZDRF

Location 01001

User identity ZDRFLocation 01001

User identity ZDRFRequested location

01110

User identity ZDRFRequested location 01110

Figure 4 Anonymization using pseudonym

ensure the privacy of legacy LBS users but all failed to preventidentity-query disclosure To the best of our knowledge nonehas taken advantage of crowdsourced LBSrsquos mobile Cloudnature to offer enhanced privacy

Matching userrsquos location to his queries has severe securityconsequences since location can be related to identity [14]and queries can be related to user preferences interests ideasand beliefs Being able to match userrsquos identity to preferencecan be used for customer profiling and behavior expectancyTyrant governments would be interested in matching queriesand posts countering their regimes to the identity andlocation of the activists Safeguarding this match is crucial tomaintain user privacy and sometimes user safety

We are going to show in this paper that all the advancedprivacy notions found in the literature used for anonymiza-tion are not suitable for LBS We are going to show also thatthe suitable privacy notions have proven vulnerabilities Wecan then deduce that location privacy in crowdsourced LBS isnotmet and could be easily breached thus the need for a newprivacy model is inevitable Finally we are going to propose anew privacymodel for crowdsourced LBS based on itsmobileCloud nature

The remainder of this paper is organized as followsSection 2 presents background information to make thispaper self-contained Section 3 surveys the latest and mostadvanced privacy notions found in the literature and provesthat none is suitable for crowdsourced LBS except 119870-anonymity Sections 4 5 and 6 discuss the ldquofrequency attackrdquowe developed against 119870-anonymity Section 4 shows themathematical model behind 119870-anonymity constraint whichis used to evaluate the privacy level of different locationprivacy preservingmechanisms (LPPMs) Section 5 proposesldquofrequency attackrdquo which is used to exploit the most secureLPPM ldquofootprintsrdquo Section 6 shows the simulation results ofldquofrequency attackrdquo to evaluate the privacy breach level

A new solution for this privacy issue is proposed in Sec-tion 7 by changing the computational environment FinallySection 8 concludes the paper

2 Related Work

In this section we present legacy LBS which is the prereq-uisite to understand crowdsourced LBS architecture and theused privacy preserving mechanisms We also present theused anonymization and privacy notion (119870-anonymity) Wethen survey the privacy requirements that should be main-tained by location privacy preserving mechanisms which arealso surveyed At the end of this section we survey the attackson119870-anonymity to show its vulnerabilities

21 Legacy Location-Based Services and 119870-AnonymityLocation-based service is a computer-level online servicethat utilizes the userrsquos current position as a critical inputfor the application providing this service [15] Locationcoordinates can be delivered through GPS equipped mobiledevices [16] or through the mobile userrsquos operator If amultilateration positioning technique is used in the servingnetwork then the exact location of the user can be specifiedIf multilateration is not used then only the distance to theserving eNB (evolved Node B) is delivered in addition tothe eNBrsquos coordinates in other words the user knows that itbelongs to the circumference of a circle centered at servingeNB its radius is the userrsquos distance to the center Note that

(i) location-dependent query is a user triggered requestto a location-based service

(ii) nearest neighbor query is a location-dependent queryrequesting the address of the nearest point of interest

Before being able to request any information fromlocation-based services the mobile user has to update hislocation by sending the coordinates to the LBS server whichin turn replies with the requested information Security of thelocation-related information transmitted over the air channelis considered to be outside the scope of this paper due to theimplemented confidentiality and integrity protection at theAccess Stratum (AS) layer We will only consider last mileeavesdropping (between PDN Gateway (P-GW) and LBSserver) carried out by outside attackers (neighbor attacks)or the service providers themselves (insider attacks) Captur-ing insecure identities and location information allows theattacker to breach the userrsquos privacy by being able to knowif the user is in a certain area and where precisely he is

Anonymization was proposed using pseudonym identi-ties as a cost-effective way to ensure identity privacy Identityanonymization is implemented at the mobile level wherea pseudonym is generated to replace the username in theLBS query as shown in Figure 4 Pseudonym anonymizationfailed to ensure the required degree of anonymity [17ndash19]because each user has a limited number of restricted areaswhich are known by the attacker and allow him to createa predetermined victim behavior These restricted areas areplaces visited regularly by the victim such as home officeand home-office road In other words the anonymizingtechnique is vulnerable to correlation attacks Although theusername is anonymized other remaining attributes calledquasi-identifiers can still be mapped to individuals (eg agesex and city [16 20 21])

4 Mobile Information Systems

Access point

Internet

Anonymity server

LBSLocationrequestanswer

Cloaked region

Figure 5 Anonymization using119870-anonymity

The scenario behind Figure 4 is that user A currentlyfound at location 01001 is interested in finding a certainservice which is returned by the LBS server as located at 01110

119870-anonymity is another proposedmethod to ensure loca-tion privacy where a location-dependent query is consideredprivate if the attacker is able to identify the requester withprobability less than 1119870 [22] 119870 is a threshold required bythe user [18]Thismethod uses an intermediate trusted servercalled anonymizer as shown in Figure 5

Every subscribed user has to register with a trustedanonymity server (anonymizer) by updating its identity andlocation This anonymizer maintains an updated databaseof userrsquos current location Each triggered query passes bythe anonymizer it replaces the userrsquos location by a cloakingregion (CR) and forwards the request to LBS server CRcontains in addition to the real user 119870 minus 1 users belongingto its neighborhood The attacker knows that one withinthis vicinity has requested this query but the probability ofidentifying the right user equals 1119870 The LBS server replieswith a list containing the identity of each user from CR withits corresponding point of interest (POI) (ie queryrsquos answer)The anonymizer then filters the POI corresponding to the realuser and passes it to the requester [17 18] 119870-anonymity alsohas drawbacks [17 18]

(i) The anonymizer is considered a single point of failureand bottleneck thus the probability of full outage ishigher than that in pseudonym anonymization

(ii) Malicious users can be physically located near thevictim thus the anonymizer will add these users inits CR Since the eavesdropper knows the malicioususers it can predict the userrsquos identity with probabilitygt 1119870

(iii) 119870-anonymity is also vulnerable to correlation attacks(iv) 119870-anonymityrsquos security level is directly proportional

to the frequency of usersrsquo location update It is notpractical and scalable to request location updatesperiodically from all users

(v) It is more profitable for idle users not to updatetheir location since moving from LTE IDLE toLTE ACTIVE to send this update message will causehigher battery consumption and excess signaling oncore level

(vi) Generated core traffic is 119870 minus 1 times more than whatis needed

(vii) Only identity and identity-query privacy are ensuredbut not location privacy

(viii) It is difficult to support continuous LBS

Crowdsourcing applications differ from legacy LBSs inthe resolution mechanism since one or more employeesrecruited from the crowd will participate in answering thequery but userrsquos interface and query mechanism (where theprivacy mechanisms lay) are slightly changed This makesthe privacy requirements and location privacy preservingmechanism that are used in legacy LBS remain valid incrowdsourcing applications

The privacy requirements applicable for legacy andcrowdsourced LBS are surveyed in the next subsection

22 Privacy Requirements of Crowdsourced LBS ApplicationsEvery crowdsourced LBS should maintain a set of privacyrequirements in order to be considered privacy preservingThese requirements are used to prevent the disclosure of sen-sitive information that could be used as part of more complexattacks The attacks range from tracking the user to profilinghis attitudes and interests LBSrsquos privacy requirements are asfollows

(i) Location privacy in crowdsourced LBS location-identity relationship is relatively easy to be exposedusing correlation attacks [14]

(ii) Identity privacy exposing a userrsquos nonrepudiatedidentity can expose him to location tracking byexploiting the well-known Evolved Packet System(EPS)Universal Mobile Telecommunication System(UMTS) tracking vulnerabilities [23 24]

(iii) Identity-query privacy LBS requests have no system-wide effect (ie wrong information will affect theuserrsquos request only without having side effects onother users) Identity traceback is not required

Location privacy preserving mechanisms (LPPMs) usedby LBS applications are surveyed in the next subsection

23 Location Privacy Preserving Mechanisms In this subsec-tion we survey the LPPMs that were proposed for legacy LBS[14] which are as follows

(i) Anonymization and obfuscation anonymization isreplacing the username part of the LBS query with

Mobile Information Systems 5

a pseudonym obfuscation is responsible for distort-ing the LBS queryrsquos second part (location) Variouspseudonym generation mechanisms are found suchas Mix zones [14]

(ii) Private Information Retrieval (PIR) both request andreply are encrypted leaving the server unable toidentify the sender or the request This mechanism issuitable for legacy LBS but impractical in crowdsourc-ing since the search engines need to send the crowda plaintext job

(iii) Feeling-based approach this is achieved by identi-fying a public region and sending a request withdisclosed location that must be at least as popular asthat space

24 Attacks on 119870-Anonymity In this subsection we surveythe attacks on 119870-anonymity To do so the following termsshould be defined first

(i) Quasi-identifiers a group of attributes that canuniquely identify an entity (eg age ZIP code)

(ii) Sensitive attributes information which results inprivacy breaching if matched to an entity

The term ldquocloaking region (CR)rdquo used in Section 2 isidentical to the term ldquoclassrdquo used by the authors of [25 26]who defined it as ldquoa set of records that are indistinguishablefrom each other with respect to certain identifying attributesrdquo[25]Wewill only use CR in this paper tomaintain coherence

Several attacks on 119870-anonymity have been discussed inthe literature which are as follows

(i) Selfish behavior based attack the authors of [27]showed the behavior of selfish users in ldquoanonymiza-tion and obfuscationrdquo LPPM and its effect on privacyprotection In the areas where the number of LBSusers is less than ldquo119896rdquo a requester can generate dummyusers so that its CR can again contain ldquo119896rdquo users(real and dummy) in order to achieve 119870-anonymityOther selfish users (free riders) can benefit from thegenerated dummy users to obfuscate their requestsThe cost of generating dummy users is not distributedfairly among the users benefiting from this LPPM

(ii) Homogeneity attack [25 26] this attack takes advan-tage of the fact that ldquo119870-anonymity creates groupsthat leak information due to lack of diversity in thesensitive attributerdquo [25] The attack expects that thesearched attribute (sensitive attribute) is the same forall the users within the sameCRAn example showingthis attack is implemented as follows consider themedical records of a certain population shown inTable 1 and its anonymized version published inTable 2 [25] Alice tries to find the medical situationof Bob who is included in Table 1 Bob is a 31-year-old Americanmale who lives in ZIP code 13053 Alicecan deduce that Bob has cancer since all the membersbelonging to his group (CR = 9 10 11 12) have thesame disease

Table 1 Medical records of a sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 13053 28 Russian Heart disease2 13068 29 American Heart disease3 13068 21 Japanese Viral infection4 13053 23 American Viral infection5 14853 50 Indian Cancer6 14853 55 Russian Heart disease7 14850 47 American Viral infection8 14850 49 American Viral infection9 13053 31 American Cancer10 13053 37 Indian Cancer11 13068 36 Japanese Cancer12 13068 35 American Cancer

(iii) Background knowledge attack this attack takesadvantage of the fact that ldquo119870-anonymity does not pro-tect against attacks based on background knowledgerdquo[25] which are not included in the offered informa-tion An example showing this attack is implementedas follows Alice tries to know the medical situationof her friend Umeko who is admitted to the samehospital as Bob so his records are also shown inTable 1 Umeko is a 21-year-old Japanese female whocurrently lives in ZIP code 13068 thus belonging tothe first CR = 1 2 3 4 Alice can deduce thatUmeko has either viral infection or heart disease It iswell known that the Japanese have an extremely lowincidence of heart disease thus Alice is nearly surethat Umeko has a viral infection

As shown previously 119870-anonymity has various weak-nesses and it would be interesting if we are able to replaceit Advanced privacy notions have been proposed to replace119870-anonymity for microdata publishing It looks tempting toadopt these notions into LBS In the coming section we aregoing to show that none of the advanced privacy notionsfound in the literature except for119870-anonymity is suitable forcrowdsourced location-based services

3 119870-Anonymity 119897-Diversity and 119905-Closeness

119870-anonymity is considered not enough to ensure the privacyneeded for microdata publishing Microdata are tables thatcontain unaggregated information which include medicalvoter registration census and customer data They areusually used as a valuable source of information for theallocation of public funds medical research and trendanalysis Table 1 is a sample microdata table while Table 2is an anonymized microdata table Other stronger privacynotions were proposed in the literature which are 119897-diversity[25] and 119905-closeness [26 28]

The ldquo119897-diversityrdquo privacy notion states that a CR is consid-ered to have 119897-diversity if there are at least 119897 ldquowell-representedrdquovalues [25] ldquoWell-representedrdquo can be interpreted as follows

6 Mobile Information Systems

Table 2 Anonymized records of the sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 130lowastlowast lt30 lowast Heart disease2 130lowastlowast lt30 lowast Heart disease3 130lowastlowast lt30 lowast Viral infection4 130lowastlowast lt30 lowast Viral infection5 1485lowast gt40 lowast Cancer6 1485lowast gt40 lowast Heart disease7 1485lowast gt40 lowast Viral infection8 1485lowast gt40 lowast Viral infection9 130lowastlowast 3lowast lowast Cancer10 130lowastlowast 3lowast lowast Cancer11 130lowastlowast 3lowast lowast Cancer12 130lowastlowast 3lowast lowast Cancerlowast in the ZIP code column refers to missing data lowastlowastmeans that 2 digits areanonymized lowast in the Nationality column is used to identify an anonymizednationality

(i) Distinct 119897-diversity it contains at least 119897 distinctvalues

(ii) Entropy 119897-diversity entropy of the values thatoccurred is greater than or equal to log 119897

(iii) Recursive 119897-diversity the most frequent value doesnot appear toomuch and the less frequent value doesnot appear too rarely

A table is considered to have 119897-diversity if all its CRs have119897-diversity Various attacks have been proposed against thisprivacy notion which results in privacy disclosure [26]Theseattacks are as follows

(i) Skewness attack [26] skewed distributions couldsatisfy 119897-diversity but still do not prevent attributedisclosure This happens when the frequency ofoccurrence of a certain sensitive attribute inside aCR differs from its frequency in the whole table(population) Consider the drug addiction rate in acertain population to be 1 One of the classes hasan addiction rate of 50 This class is consideredattribute disclosing since a member of this class isidentified as addict with a high probability

(ii) Similarity attack [26] distinct but semantically simi-lar sensitive values could satisfy 119897-diversity but meanthe same for the attacker If the sensitive informationshown in a table contains cocaine addict heroinaddict marijuana addict and so forth as distinctentries it could be considered satisfying 119897-diversitybut for a drug dealer all these entries are consideredas a target

The ldquo119905-closenessrdquo concept is another privacy notion thatdifferentiates between the information gained about thewhole population and that about specific individuals [26]As the first gain is tolerated and motivated the second isconsidered privacy breaching A class is considered to have

Table 3 Data stored at the LBS anonymizer

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1No

Location 2 NoLocation 3 Yes

CR2Location 4

Query 2No

Location 5 YesLocation 6 No

Table 4 Data captured by the adversary

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1lowast

Location 2 lowast

Location 3 lowast

CR2Location 4

Query 2lowast

Location 5 lowast

Location 6 lowast

lowast refers to an anonymized Boolean value (true or false)

ldquo119905-closenessrdquo if the distance between the distribution of asensitive attribute in the studied class and the distributionin the whole table (population) is less than or equal to athreshold 119905 The distance is measured using Earth MoverrsquosDistance (EMD)

It is shown in [26] that ldquo119905-closenessrdquo ensures higherprivacy than ldquo119897-diversityrdquo

31 119870-Anonymity versus 119897-Diversity and 119905-Closeness Thedifference in nature between microdata tables and LBS CRcaptured data is the main factor that prevents the implemen-tation of ldquo119897-diversityrdquo and ldquo119905-closenessrdquo as privacy metrics inLPPMs

Anonymizing microdata keeps the sensitive attributesintact and hides parts of the quasi-identifiers to preventuniquely identifying an entity An entity can only be iden-tified as an unknown record in a class ldquo119897-diversityrdquo and ldquo119905-closenessrdquo try to keep the relationship between the entities ofa class and its sensitive attributes as private as possible usingthe techniques discussed previously

LBS anonymization tries on the contrary to hide thesensitive data instead of generalizing the quasi-identifierswhich are needed by the location-based server (locationexpertise and query) to be able to respond to the userrsquos queryTable 3 represents the LBS data stored in the anonymizer

The anonymized data that are transmitted to the LBSserver and could be captured by the adversary are shown inTable 4

As can be seen in the microdata anonymization in Tables1 and 2 and LBS anonymization in Tables 3 and 4 the latterfocuses on hiding the sensitive data which totally negates theaim behind publishing microdata

Mobile Information Systems 7

Based on the above discussion we can deduce that 119897-diversity and 119905-closeness are not suitable for LBS anonymiza-tion thus 119870-anonymity remains the best suitable privacynotion suitable for such services

We have shown in the previous sections that LBS appli-cations are in need of privacy preserving mechanisms basedon valid privacy notions We have also shown that thealready implemented privacy notion119870-anonymity could beexploited using various attacks but stricter privacy notionsare not suitable We can conclude that LPPMs using 119870-anonymity are the best available solutions To prove the needfor a novel privacy concept we will present in the comingsections a newattack on119870-anonymity thatwehave developed[29]

4 Mathematical Model

We will prove in this section that 119870-anonymity is private ifldquozero prior knowledgerdquo is assumed and the adversary couldnot perform the attack for a long continuous period

Let 1199091 1199092 119909

119899 be the set of users found in a CR

where ldquo119899rdquo is its size and 119899 ge 1 Let ldquo119909rdquo be our investigateduser If 119909 notin 119909

1 1199092 119909

119899 that is 119909 notin CR then 119909 is for sure

not the requestor The probability of a user being added to aCR without being the true requestor is

119875 (119909 isin CR (119909119894= requestor amp 119909 = 119909

119894))

=119875 (119909 isin CR cap 119909

119894= requestor)

119875 (119909119894= requestor)

=119899 minus 1

119873 minus 1

(1)

where119873 is the number of users in the area controlled by thestudied anonymizer where 1 le 119899 le 119873 The probability of auser being added to a CR while being the true requestor is

119875 (119909 isin CR (119909 = requestor)) = 1 (2)

We can deduce that the probability of a user being addedto a CR is119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

(3)

If all the users have the same frequency of sendingrequests (ie 119909

119894are equiprobable ie 119875(119909

119894= requestor) =

1119873) we can deduce that119875 (119909 = requestor119909 isin CR)

=119875 (119909 = requestor119909 isin CR)

119875 (119909 isin CR)=

(1119873)

(119899119873)=

1

119899

(4)

Based on the assumption that all users are equiprobable(since the adversary has ldquozero prior knowledgerdquo and could notperform the attack for a long continuous period he cannotassume otherwise) 119870-anonymity is satisfied if the cloakingregion size is greater than119870 that is 119899 ge 119870 We will prove inthe coming section that the privacy offered by119870-anonymitycould be breached using our proposed attack (frequencyattack) if the adversary is able to perform the attack for a longperiod even if ldquozero prior knowledgerdquo is assumed

5 Frequency Attack

Until now we have considered that all users are equiprobablebut in reality they are not Consider an LBS which can beused by a user to retrieve the location of the nearest shop(bakery Chinese restaurant etc) While being home I amaware of the most important places which I usually accessand I know the bakeries barbers supermarkets and so forthwithin my town so I need no support in locating these placeswhile on the contrary I can help othersWhen visiting places Iamnew to it will be hard forme to find certain targets (metrosupermarket etc) especially when facing language problemsMy rate of asking for guidance at these new places is muchhigher than that while being atmy known region Let comfortzone be a set of areas where a certain user is familiar with andneeds no support from the studied LBS in finding places Auserrsquos comfort zone includes his hometown and workplaceOutside this comfort zone the query frequency of this userincreases

Based on the above discussion we can easily assumethat the usersrsquo frequencies are not equal that is exist119894119875(119909

119894=

requestor) = 1119873If a user ldquo119894rdquo is to be found in a CR then either he is the real

requestor or he is not and selected from the ldquo119873minus1rdquo remainingusers

If user ldquo119894rdquo is the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = 1 since the real requestor should be inthe CR else he will not receive a valid answer

If user ldquo119894rdquo is not the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = (119899minus1)(119873minus1) that is one of the ldquo119899minus1rdquoselected users from the ldquo119873 minus 1rdquo remaining users

Let 119875119894= 119875(119909

119894= requestor) then

119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

=

119873

sum

119894=1

119894 =119909

(119899 minus 1

119873 minus 1)119875119894+ (1) 119875

119909= (

119899 minus 1

119873 minus 1)

119873

sum

119894=1

119894 =119909

119875119894+ 119875119909

Since119873

sum

119894=1

119875119894= 1 then

119873

sum

119894=1

119894 =119909

119875119894= 1 minus 119875

119909

(5)

then

119875 (119909 isin CR) = (119899 minus 1

119873 minus 1) (1 minus 119875

119909) + 119875119909

=119899 minus 1

119873 minus 1+ (

119873 minus 119899

119873 minus 1)119875119909

(6)

let

119886 = (119873 minus 119899

119873 minus 1)

119887 = (119899 minus 1

119873 minus 1)

(7)

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 4: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

4 Mobile Information Systems

Access point

Internet

Anonymity server

LBSLocationrequestanswer

Cloaked region

Figure 5 Anonymization using119870-anonymity

The scenario behind Figure 4 is that user A currentlyfound at location 01001 is interested in finding a certainservice which is returned by the LBS server as located at 01110

119870-anonymity is another proposedmethod to ensure loca-tion privacy where a location-dependent query is consideredprivate if the attacker is able to identify the requester withprobability less than 1119870 [22] 119870 is a threshold required bythe user [18]Thismethod uses an intermediate trusted servercalled anonymizer as shown in Figure 5

Every subscribed user has to register with a trustedanonymity server (anonymizer) by updating its identity andlocation This anonymizer maintains an updated databaseof userrsquos current location Each triggered query passes bythe anonymizer it replaces the userrsquos location by a cloakingregion (CR) and forwards the request to LBS server CRcontains in addition to the real user 119870 minus 1 users belongingto its neighborhood The attacker knows that one withinthis vicinity has requested this query but the probability ofidentifying the right user equals 1119870 The LBS server replieswith a list containing the identity of each user from CR withits corresponding point of interest (POI) (ie queryrsquos answer)The anonymizer then filters the POI corresponding to the realuser and passes it to the requester [17 18] 119870-anonymity alsohas drawbacks [17 18]

(i) The anonymizer is considered a single point of failureand bottleneck thus the probability of full outage ishigher than that in pseudonym anonymization

(ii) Malicious users can be physically located near thevictim thus the anonymizer will add these users inits CR Since the eavesdropper knows the malicioususers it can predict the userrsquos identity with probabilitygt 1119870

(iii) 119870-anonymity is also vulnerable to correlation attacks(iv) 119870-anonymityrsquos security level is directly proportional

to the frequency of usersrsquo location update It is notpractical and scalable to request location updatesperiodically from all users

(v) It is more profitable for idle users not to updatetheir location since moving from LTE IDLE toLTE ACTIVE to send this update message will causehigher battery consumption and excess signaling oncore level

(vi) Generated core traffic is 119870 minus 1 times more than whatis needed

(vii) Only identity and identity-query privacy are ensuredbut not location privacy

(viii) It is difficult to support continuous LBS

Crowdsourcing applications differ from legacy LBSs inthe resolution mechanism since one or more employeesrecruited from the crowd will participate in answering thequery but userrsquos interface and query mechanism (where theprivacy mechanisms lay) are slightly changed This makesthe privacy requirements and location privacy preservingmechanism that are used in legacy LBS remain valid incrowdsourcing applications

The privacy requirements applicable for legacy andcrowdsourced LBS are surveyed in the next subsection

22 Privacy Requirements of Crowdsourced LBS ApplicationsEvery crowdsourced LBS should maintain a set of privacyrequirements in order to be considered privacy preservingThese requirements are used to prevent the disclosure of sen-sitive information that could be used as part of more complexattacks The attacks range from tracking the user to profilinghis attitudes and interests LBSrsquos privacy requirements are asfollows

(i) Location privacy in crowdsourced LBS location-identity relationship is relatively easy to be exposedusing correlation attacks [14]

(ii) Identity privacy exposing a userrsquos nonrepudiatedidentity can expose him to location tracking byexploiting the well-known Evolved Packet System(EPS)Universal Mobile Telecommunication System(UMTS) tracking vulnerabilities [23 24]

(iii) Identity-query privacy LBS requests have no system-wide effect (ie wrong information will affect theuserrsquos request only without having side effects onother users) Identity traceback is not required

Location privacy preserving mechanisms (LPPMs) usedby LBS applications are surveyed in the next subsection

23 Location Privacy Preserving Mechanisms In this subsec-tion we survey the LPPMs that were proposed for legacy LBS[14] which are as follows

(i) Anonymization and obfuscation anonymization isreplacing the username part of the LBS query with

Mobile Information Systems 5

a pseudonym obfuscation is responsible for distort-ing the LBS queryrsquos second part (location) Variouspseudonym generation mechanisms are found suchas Mix zones [14]

(ii) Private Information Retrieval (PIR) both request andreply are encrypted leaving the server unable toidentify the sender or the request This mechanism issuitable for legacy LBS but impractical in crowdsourc-ing since the search engines need to send the crowda plaintext job

(iii) Feeling-based approach this is achieved by identi-fying a public region and sending a request withdisclosed location that must be at least as popular asthat space

24 Attacks on 119870-Anonymity In this subsection we surveythe attacks on 119870-anonymity To do so the following termsshould be defined first

(i) Quasi-identifiers a group of attributes that canuniquely identify an entity (eg age ZIP code)

(ii) Sensitive attributes information which results inprivacy breaching if matched to an entity

The term ldquocloaking region (CR)rdquo used in Section 2 isidentical to the term ldquoclassrdquo used by the authors of [25 26]who defined it as ldquoa set of records that are indistinguishablefrom each other with respect to certain identifying attributesrdquo[25]Wewill only use CR in this paper tomaintain coherence

Several attacks on 119870-anonymity have been discussed inthe literature which are as follows

(i) Selfish behavior based attack the authors of [27]showed the behavior of selfish users in ldquoanonymiza-tion and obfuscationrdquo LPPM and its effect on privacyprotection In the areas where the number of LBSusers is less than ldquo119896rdquo a requester can generate dummyusers so that its CR can again contain ldquo119896rdquo users(real and dummy) in order to achieve 119870-anonymityOther selfish users (free riders) can benefit from thegenerated dummy users to obfuscate their requestsThe cost of generating dummy users is not distributedfairly among the users benefiting from this LPPM

(ii) Homogeneity attack [25 26] this attack takes advan-tage of the fact that ldquo119870-anonymity creates groupsthat leak information due to lack of diversity in thesensitive attributerdquo [25] The attack expects that thesearched attribute (sensitive attribute) is the same forall the users within the sameCRAn example showingthis attack is implemented as follows consider themedical records of a certain population shown inTable 1 and its anonymized version published inTable 2 [25] Alice tries to find the medical situationof Bob who is included in Table 1 Bob is a 31-year-old Americanmale who lives in ZIP code 13053 Alicecan deduce that Bob has cancer since all the membersbelonging to his group (CR = 9 10 11 12) have thesame disease

Table 1 Medical records of a sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 13053 28 Russian Heart disease2 13068 29 American Heart disease3 13068 21 Japanese Viral infection4 13053 23 American Viral infection5 14853 50 Indian Cancer6 14853 55 Russian Heart disease7 14850 47 American Viral infection8 14850 49 American Viral infection9 13053 31 American Cancer10 13053 37 Indian Cancer11 13068 36 Japanese Cancer12 13068 35 American Cancer

(iii) Background knowledge attack this attack takesadvantage of the fact that ldquo119870-anonymity does not pro-tect against attacks based on background knowledgerdquo[25] which are not included in the offered informa-tion An example showing this attack is implementedas follows Alice tries to know the medical situationof her friend Umeko who is admitted to the samehospital as Bob so his records are also shown inTable 1 Umeko is a 21-year-old Japanese female whocurrently lives in ZIP code 13068 thus belonging tothe first CR = 1 2 3 4 Alice can deduce thatUmeko has either viral infection or heart disease It iswell known that the Japanese have an extremely lowincidence of heart disease thus Alice is nearly surethat Umeko has a viral infection

As shown previously 119870-anonymity has various weak-nesses and it would be interesting if we are able to replaceit Advanced privacy notions have been proposed to replace119870-anonymity for microdata publishing It looks tempting toadopt these notions into LBS In the coming section we aregoing to show that none of the advanced privacy notionsfound in the literature except for119870-anonymity is suitable forcrowdsourced location-based services

3 119870-Anonymity 119897-Diversity and 119905-Closeness

119870-anonymity is considered not enough to ensure the privacyneeded for microdata publishing Microdata are tables thatcontain unaggregated information which include medicalvoter registration census and customer data They areusually used as a valuable source of information for theallocation of public funds medical research and trendanalysis Table 1 is a sample microdata table while Table 2is an anonymized microdata table Other stronger privacynotions were proposed in the literature which are 119897-diversity[25] and 119905-closeness [26 28]

The ldquo119897-diversityrdquo privacy notion states that a CR is consid-ered to have 119897-diversity if there are at least 119897 ldquowell-representedrdquovalues [25] ldquoWell-representedrdquo can be interpreted as follows

6 Mobile Information Systems

Table 2 Anonymized records of the sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 130lowastlowast lt30 lowast Heart disease2 130lowastlowast lt30 lowast Heart disease3 130lowastlowast lt30 lowast Viral infection4 130lowastlowast lt30 lowast Viral infection5 1485lowast gt40 lowast Cancer6 1485lowast gt40 lowast Heart disease7 1485lowast gt40 lowast Viral infection8 1485lowast gt40 lowast Viral infection9 130lowastlowast 3lowast lowast Cancer10 130lowastlowast 3lowast lowast Cancer11 130lowastlowast 3lowast lowast Cancer12 130lowastlowast 3lowast lowast Cancerlowast in the ZIP code column refers to missing data lowastlowastmeans that 2 digits areanonymized lowast in the Nationality column is used to identify an anonymizednationality

(i) Distinct 119897-diversity it contains at least 119897 distinctvalues

(ii) Entropy 119897-diversity entropy of the values thatoccurred is greater than or equal to log 119897

(iii) Recursive 119897-diversity the most frequent value doesnot appear toomuch and the less frequent value doesnot appear too rarely

A table is considered to have 119897-diversity if all its CRs have119897-diversity Various attacks have been proposed against thisprivacy notion which results in privacy disclosure [26]Theseattacks are as follows

(i) Skewness attack [26] skewed distributions couldsatisfy 119897-diversity but still do not prevent attributedisclosure This happens when the frequency ofoccurrence of a certain sensitive attribute inside aCR differs from its frequency in the whole table(population) Consider the drug addiction rate in acertain population to be 1 One of the classes hasan addiction rate of 50 This class is consideredattribute disclosing since a member of this class isidentified as addict with a high probability

(ii) Similarity attack [26] distinct but semantically simi-lar sensitive values could satisfy 119897-diversity but meanthe same for the attacker If the sensitive informationshown in a table contains cocaine addict heroinaddict marijuana addict and so forth as distinctentries it could be considered satisfying 119897-diversitybut for a drug dealer all these entries are consideredas a target

The ldquo119905-closenessrdquo concept is another privacy notion thatdifferentiates between the information gained about thewhole population and that about specific individuals [26]As the first gain is tolerated and motivated the second isconsidered privacy breaching A class is considered to have

Table 3 Data stored at the LBS anonymizer

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1No

Location 2 NoLocation 3 Yes

CR2Location 4

Query 2No

Location 5 YesLocation 6 No

Table 4 Data captured by the adversary

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1lowast

Location 2 lowast

Location 3 lowast

CR2Location 4

Query 2lowast

Location 5 lowast

Location 6 lowast

lowast refers to an anonymized Boolean value (true or false)

ldquo119905-closenessrdquo if the distance between the distribution of asensitive attribute in the studied class and the distributionin the whole table (population) is less than or equal to athreshold 119905 The distance is measured using Earth MoverrsquosDistance (EMD)

It is shown in [26] that ldquo119905-closenessrdquo ensures higherprivacy than ldquo119897-diversityrdquo

31 119870-Anonymity versus 119897-Diversity and 119905-Closeness Thedifference in nature between microdata tables and LBS CRcaptured data is the main factor that prevents the implemen-tation of ldquo119897-diversityrdquo and ldquo119905-closenessrdquo as privacy metrics inLPPMs

Anonymizing microdata keeps the sensitive attributesintact and hides parts of the quasi-identifiers to preventuniquely identifying an entity An entity can only be iden-tified as an unknown record in a class ldquo119897-diversityrdquo and ldquo119905-closenessrdquo try to keep the relationship between the entities ofa class and its sensitive attributes as private as possible usingthe techniques discussed previously

LBS anonymization tries on the contrary to hide thesensitive data instead of generalizing the quasi-identifierswhich are needed by the location-based server (locationexpertise and query) to be able to respond to the userrsquos queryTable 3 represents the LBS data stored in the anonymizer

The anonymized data that are transmitted to the LBSserver and could be captured by the adversary are shown inTable 4

As can be seen in the microdata anonymization in Tables1 and 2 and LBS anonymization in Tables 3 and 4 the latterfocuses on hiding the sensitive data which totally negates theaim behind publishing microdata

Mobile Information Systems 7

Based on the above discussion we can deduce that 119897-diversity and 119905-closeness are not suitable for LBS anonymiza-tion thus 119870-anonymity remains the best suitable privacynotion suitable for such services

We have shown in the previous sections that LBS appli-cations are in need of privacy preserving mechanisms basedon valid privacy notions We have also shown that thealready implemented privacy notion119870-anonymity could beexploited using various attacks but stricter privacy notionsare not suitable We can conclude that LPPMs using 119870-anonymity are the best available solutions To prove the needfor a novel privacy concept we will present in the comingsections a newattack on119870-anonymity thatwehave developed[29]

4 Mathematical Model

We will prove in this section that 119870-anonymity is private ifldquozero prior knowledgerdquo is assumed and the adversary couldnot perform the attack for a long continuous period

Let 1199091 1199092 119909

119899 be the set of users found in a CR

where ldquo119899rdquo is its size and 119899 ge 1 Let ldquo119909rdquo be our investigateduser If 119909 notin 119909

1 1199092 119909

119899 that is 119909 notin CR then 119909 is for sure

not the requestor The probability of a user being added to aCR without being the true requestor is

119875 (119909 isin CR (119909119894= requestor amp 119909 = 119909

119894))

=119875 (119909 isin CR cap 119909

119894= requestor)

119875 (119909119894= requestor)

=119899 minus 1

119873 minus 1

(1)

where119873 is the number of users in the area controlled by thestudied anonymizer where 1 le 119899 le 119873 The probability of auser being added to a CR while being the true requestor is

119875 (119909 isin CR (119909 = requestor)) = 1 (2)

We can deduce that the probability of a user being addedto a CR is119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

(3)

If all the users have the same frequency of sendingrequests (ie 119909

119894are equiprobable ie 119875(119909

119894= requestor) =

1119873) we can deduce that119875 (119909 = requestor119909 isin CR)

=119875 (119909 = requestor119909 isin CR)

119875 (119909 isin CR)=

(1119873)

(119899119873)=

1

119899

(4)

Based on the assumption that all users are equiprobable(since the adversary has ldquozero prior knowledgerdquo and could notperform the attack for a long continuous period he cannotassume otherwise) 119870-anonymity is satisfied if the cloakingregion size is greater than119870 that is 119899 ge 119870 We will prove inthe coming section that the privacy offered by119870-anonymitycould be breached using our proposed attack (frequencyattack) if the adversary is able to perform the attack for a longperiod even if ldquozero prior knowledgerdquo is assumed

5 Frequency Attack

Until now we have considered that all users are equiprobablebut in reality they are not Consider an LBS which can beused by a user to retrieve the location of the nearest shop(bakery Chinese restaurant etc) While being home I amaware of the most important places which I usually accessand I know the bakeries barbers supermarkets and so forthwithin my town so I need no support in locating these placeswhile on the contrary I can help othersWhen visiting places Iamnew to it will be hard forme to find certain targets (metrosupermarket etc) especially when facing language problemsMy rate of asking for guidance at these new places is muchhigher than that while being atmy known region Let comfortzone be a set of areas where a certain user is familiar with andneeds no support from the studied LBS in finding places Auserrsquos comfort zone includes his hometown and workplaceOutside this comfort zone the query frequency of this userincreases

Based on the above discussion we can easily assumethat the usersrsquo frequencies are not equal that is exist119894119875(119909

119894=

requestor) = 1119873If a user ldquo119894rdquo is to be found in a CR then either he is the real

requestor or he is not and selected from the ldquo119873minus1rdquo remainingusers

If user ldquo119894rdquo is the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = 1 since the real requestor should be inthe CR else he will not receive a valid answer

If user ldquo119894rdquo is not the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = (119899minus1)(119873minus1) that is one of the ldquo119899minus1rdquoselected users from the ldquo119873 minus 1rdquo remaining users

Let 119875119894= 119875(119909

119894= requestor) then

119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

=

119873

sum

119894=1

119894 =119909

(119899 minus 1

119873 minus 1)119875119894+ (1) 119875

119909= (

119899 minus 1

119873 minus 1)

119873

sum

119894=1

119894 =119909

119875119894+ 119875119909

Since119873

sum

119894=1

119875119894= 1 then

119873

sum

119894=1

119894 =119909

119875119894= 1 minus 119875

119909

(5)

then

119875 (119909 isin CR) = (119899 minus 1

119873 minus 1) (1 minus 119875

119909) + 119875119909

=119899 minus 1

119873 minus 1+ (

119873 minus 119899

119873 minus 1)119875119909

(6)

let

119886 = (119873 minus 119899

119873 minus 1)

119887 = (119899 minus 1

119873 minus 1)

(7)

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 5: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Mobile Information Systems 5

a pseudonym obfuscation is responsible for distort-ing the LBS queryrsquos second part (location) Variouspseudonym generation mechanisms are found suchas Mix zones [14]

(ii) Private Information Retrieval (PIR) both request andreply are encrypted leaving the server unable toidentify the sender or the request This mechanism issuitable for legacy LBS but impractical in crowdsourc-ing since the search engines need to send the crowda plaintext job

(iii) Feeling-based approach this is achieved by identi-fying a public region and sending a request withdisclosed location that must be at least as popular asthat space

24 Attacks on 119870-Anonymity In this subsection we surveythe attacks on 119870-anonymity To do so the following termsshould be defined first

(i) Quasi-identifiers a group of attributes that canuniquely identify an entity (eg age ZIP code)

(ii) Sensitive attributes information which results inprivacy breaching if matched to an entity

The term ldquocloaking region (CR)rdquo used in Section 2 isidentical to the term ldquoclassrdquo used by the authors of [25 26]who defined it as ldquoa set of records that are indistinguishablefrom each other with respect to certain identifying attributesrdquo[25]Wewill only use CR in this paper tomaintain coherence

Several attacks on 119870-anonymity have been discussed inthe literature which are as follows

(i) Selfish behavior based attack the authors of [27]showed the behavior of selfish users in ldquoanonymiza-tion and obfuscationrdquo LPPM and its effect on privacyprotection In the areas where the number of LBSusers is less than ldquo119896rdquo a requester can generate dummyusers so that its CR can again contain ldquo119896rdquo users(real and dummy) in order to achieve 119870-anonymityOther selfish users (free riders) can benefit from thegenerated dummy users to obfuscate their requestsThe cost of generating dummy users is not distributedfairly among the users benefiting from this LPPM

(ii) Homogeneity attack [25 26] this attack takes advan-tage of the fact that ldquo119870-anonymity creates groupsthat leak information due to lack of diversity in thesensitive attributerdquo [25] The attack expects that thesearched attribute (sensitive attribute) is the same forall the users within the sameCRAn example showingthis attack is implemented as follows consider themedical records of a certain population shown inTable 1 and its anonymized version published inTable 2 [25] Alice tries to find the medical situationof Bob who is included in Table 1 Bob is a 31-year-old Americanmale who lives in ZIP code 13053 Alicecan deduce that Bob has cancer since all the membersbelonging to his group (CR = 9 10 11 12) have thesame disease

Table 1 Medical records of a sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 13053 28 Russian Heart disease2 13068 29 American Heart disease3 13068 21 Japanese Viral infection4 13053 23 American Viral infection5 14853 50 Indian Cancer6 14853 55 Russian Heart disease7 14850 47 American Viral infection8 14850 49 American Viral infection9 13053 31 American Cancer10 13053 37 Indian Cancer11 13068 36 Japanese Cancer12 13068 35 American Cancer

(iii) Background knowledge attack this attack takesadvantage of the fact that ldquo119870-anonymity does not pro-tect against attacks based on background knowledgerdquo[25] which are not included in the offered informa-tion An example showing this attack is implementedas follows Alice tries to know the medical situationof her friend Umeko who is admitted to the samehospital as Bob so his records are also shown inTable 1 Umeko is a 21-year-old Japanese female whocurrently lives in ZIP code 13068 thus belonging tothe first CR = 1 2 3 4 Alice can deduce thatUmeko has either viral infection or heart disease It iswell known that the Japanese have an extremely lowincidence of heart disease thus Alice is nearly surethat Umeko has a viral infection

As shown previously 119870-anonymity has various weak-nesses and it would be interesting if we are able to replaceit Advanced privacy notions have been proposed to replace119870-anonymity for microdata publishing It looks tempting toadopt these notions into LBS In the coming section we aregoing to show that none of the advanced privacy notionsfound in the literature except for119870-anonymity is suitable forcrowdsourced location-based services

3 119870-Anonymity 119897-Diversity and 119905-Closeness

119870-anonymity is considered not enough to ensure the privacyneeded for microdata publishing Microdata are tables thatcontain unaggregated information which include medicalvoter registration census and customer data They areusually used as a valuable source of information for theallocation of public funds medical research and trendanalysis Table 1 is a sample microdata table while Table 2is an anonymized microdata table Other stronger privacynotions were proposed in the literature which are 119897-diversity[25] and 119905-closeness [26 28]

The ldquo119897-diversityrdquo privacy notion states that a CR is consid-ered to have 119897-diversity if there are at least 119897 ldquowell-representedrdquovalues [25] ldquoWell-representedrdquo can be interpreted as follows

6 Mobile Information Systems

Table 2 Anonymized records of the sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 130lowastlowast lt30 lowast Heart disease2 130lowastlowast lt30 lowast Heart disease3 130lowastlowast lt30 lowast Viral infection4 130lowastlowast lt30 lowast Viral infection5 1485lowast gt40 lowast Cancer6 1485lowast gt40 lowast Heart disease7 1485lowast gt40 lowast Viral infection8 1485lowast gt40 lowast Viral infection9 130lowastlowast 3lowast lowast Cancer10 130lowastlowast 3lowast lowast Cancer11 130lowastlowast 3lowast lowast Cancer12 130lowastlowast 3lowast lowast Cancerlowast in the ZIP code column refers to missing data lowastlowastmeans that 2 digits areanonymized lowast in the Nationality column is used to identify an anonymizednationality

(i) Distinct 119897-diversity it contains at least 119897 distinctvalues

(ii) Entropy 119897-diversity entropy of the values thatoccurred is greater than or equal to log 119897

(iii) Recursive 119897-diversity the most frequent value doesnot appear toomuch and the less frequent value doesnot appear too rarely

A table is considered to have 119897-diversity if all its CRs have119897-diversity Various attacks have been proposed against thisprivacy notion which results in privacy disclosure [26]Theseattacks are as follows

(i) Skewness attack [26] skewed distributions couldsatisfy 119897-diversity but still do not prevent attributedisclosure This happens when the frequency ofoccurrence of a certain sensitive attribute inside aCR differs from its frequency in the whole table(population) Consider the drug addiction rate in acertain population to be 1 One of the classes hasan addiction rate of 50 This class is consideredattribute disclosing since a member of this class isidentified as addict with a high probability

(ii) Similarity attack [26] distinct but semantically simi-lar sensitive values could satisfy 119897-diversity but meanthe same for the attacker If the sensitive informationshown in a table contains cocaine addict heroinaddict marijuana addict and so forth as distinctentries it could be considered satisfying 119897-diversitybut for a drug dealer all these entries are consideredas a target

The ldquo119905-closenessrdquo concept is another privacy notion thatdifferentiates between the information gained about thewhole population and that about specific individuals [26]As the first gain is tolerated and motivated the second isconsidered privacy breaching A class is considered to have

Table 3 Data stored at the LBS anonymizer

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1No

Location 2 NoLocation 3 Yes

CR2Location 4

Query 2No

Location 5 YesLocation 6 No

Table 4 Data captured by the adversary

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1lowast

Location 2 lowast

Location 3 lowast

CR2Location 4

Query 2lowast

Location 5 lowast

Location 6 lowast

lowast refers to an anonymized Boolean value (true or false)

ldquo119905-closenessrdquo if the distance between the distribution of asensitive attribute in the studied class and the distributionin the whole table (population) is less than or equal to athreshold 119905 The distance is measured using Earth MoverrsquosDistance (EMD)

It is shown in [26] that ldquo119905-closenessrdquo ensures higherprivacy than ldquo119897-diversityrdquo

31 119870-Anonymity versus 119897-Diversity and 119905-Closeness Thedifference in nature between microdata tables and LBS CRcaptured data is the main factor that prevents the implemen-tation of ldquo119897-diversityrdquo and ldquo119905-closenessrdquo as privacy metrics inLPPMs

Anonymizing microdata keeps the sensitive attributesintact and hides parts of the quasi-identifiers to preventuniquely identifying an entity An entity can only be iden-tified as an unknown record in a class ldquo119897-diversityrdquo and ldquo119905-closenessrdquo try to keep the relationship between the entities ofa class and its sensitive attributes as private as possible usingthe techniques discussed previously

LBS anonymization tries on the contrary to hide thesensitive data instead of generalizing the quasi-identifierswhich are needed by the location-based server (locationexpertise and query) to be able to respond to the userrsquos queryTable 3 represents the LBS data stored in the anonymizer

The anonymized data that are transmitted to the LBSserver and could be captured by the adversary are shown inTable 4

As can be seen in the microdata anonymization in Tables1 and 2 and LBS anonymization in Tables 3 and 4 the latterfocuses on hiding the sensitive data which totally negates theaim behind publishing microdata

Mobile Information Systems 7

Based on the above discussion we can deduce that 119897-diversity and 119905-closeness are not suitable for LBS anonymiza-tion thus 119870-anonymity remains the best suitable privacynotion suitable for such services

We have shown in the previous sections that LBS appli-cations are in need of privacy preserving mechanisms basedon valid privacy notions We have also shown that thealready implemented privacy notion119870-anonymity could beexploited using various attacks but stricter privacy notionsare not suitable We can conclude that LPPMs using 119870-anonymity are the best available solutions To prove the needfor a novel privacy concept we will present in the comingsections a newattack on119870-anonymity thatwehave developed[29]

4 Mathematical Model

We will prove in this section that 119870-anonymity is private ifldquozero prior knowledgerdquo is assumed and the adversary couldnot perform the attack for a long continuous period

Let 1199091 1199092 119909

119899 be the set of users found in a CR

where ldquo119899rdquo is its size and 119899 ge 1 Let ldquo119909rdquo be our investigateduser If 119909 notin 119909

1 1199092 119909

119899 that is 119909 notin CR then 119909 is for sure

not the requestor The probability of a user being added to aCR without being the true requestor is

119875 (119909 isin CR (119909119894= requestor amp 119909 = 119909

119894))

=119875 (119909 isin CR cap 119909

119894= requestor)

119875 (119909119894= requestor)

=119899 minus 1

119873 minus 1

(1)

where119873 is the number of users in the area controlled by thestudied anonymizer where 1 le 119899 le 119873 The probability of auser being added to a CR while being the true requestor is

119875 (119909 isin CR (119909 = requestor)) = 1 (2)

We can deduce that the probability of a user being addedto a CR is119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

(3)

If all the users have the same frequency of sendingrequests (ie 119909

119894are equiprobable ie 119875(119909

119894= requestor) =

1119873) we can deduce that119875 (119909 = requestor119909 isin CR)

=119875 (119909 = requestor119909 isin CR)

119875 (119909 isin CR)=

(1119873)

(119899119873)=

1

119899

(4)

Based on the assumption that all users are equiprobable(since the adversary has ldquozero prior knowledgerdquo and could notperform the attack for a long continuous period he cannotassume otherwise) 119870-anonymity is satisfied if the cloakingregion size is greater than119870 that is 119899 ge 119870 We will prove inthe coming section that the privacy offered by119870-anonymitycould be breached using our proposed attack (frequencyattack) if the adversary is able to perform the attack for a longperiod even if ldquozero prior knowledgerdquo is assumed

5 Frequency Attack

Until now we have considered that all users are equiprobablebut in reality they are not Consider an LBS which can beused by a user to retrieve the location of the nearest shop(bakery Chinese restaurant etc) While being home I amaware of the most important places which I usually accessand I know the bakeries barbers supermarkets and so forthwithin my town so I need no support in locating these placeswhile on the contrary I can help othersWhen visiting places Iamnew to it will be hard forme to find certain targets (metrosupermarket etc) especially when facing language problemsMy rate of asking for guidance at these new places is muchhigher than that while being atmy known region Let comfortzone be a set of areas where a certain user is familiar with andneeds no support from the studied LBS in finding places Auserrsquos comfort zone includes his hometown and workplaceOutside this comfort zone the query frequency of this userincreases

Based on the above discussion we can easily assumethat the usersrsquo frequencies are not equal that is exist119894119875(119909

119894=

requestor) = 1119873If a user ldquo119894rdquo is to be found in a CR then either he is the real

requestor or he is not and selected from the ldquo119873minus1rdquo remainingusers

If user ldquo119894rdquo is the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = 1 since the real requestor should be inthe CR else he will not receive a valid answer

If user ldquo119894rdquo is not the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = (119899minus1)(119873minus1) that is one of the ldquo119899minus1rdquoselected users from the ldquo119873 minus 1rdquo remaining users

Let 119875119894= 119875(119909

119894= requestor) then

119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

=

119873

sum

119894=1

119894 =119909

(119899 minus 1

119873 minus 1)119875119894+ (1) 119875

119909= (

119899 minus 1

119873 minus 1)

119873

sum

119894=1

119894 =119909

119875119894+ 119875119909

Since119873

sum

119894=1

119875119894= 1 then

119873

sum

119894=1

119894 =119909

119875119894= 1 minus 119875

119909

(5)

then

119875 (119909 isin CR) = (119899 minus 1

119873 minus 1) (1 minus 119875

119909) + 119875119909

=119899 minus 1

119873 minus 1+ (

119873 minus 119899

119873 minus 1)119875119909

(6)

let

119886 = (119873 minus 119899

119873 minus 1)

119887 = (119899 minus 1

119873 minus 1)

(7)

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 6: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

6 Mobile Information Systems

Table 2 Anonymized records of the sample population

Nonsensitive (quasi-identifiers) SensitiveZIP code Age Nationality Situation

1 130lowastlowast lt30 lowast Heart disease2 130lowastlowast lt30 lowast Heart disease3 130lowastlowast lt30 lowast Viral infection4 130lowastlowast lt30 lowast Viral infection5 1485lowast gt40 lowast Cancer6 1485lowast gt40 lowast Heart disease7 1485lowast gt40 lowast Viral infection8 1485lowast gt40 lowast Viral infection9 130lowastlowast 3lowast lowast Cancer10 130lowastlowast 3lowast lowast Cancer11 130lowastlowast 3lowast lowast Cancer12 130lowastlowast 3lowast lowast Cancerlowast in the ZIP code column refers to missing data lowastlowastmeans that 2 digits areanonymized lowast in the Nationality column is used to identify an anonymizednationality

(i) Distinct 119897-diversity it contains at least 119897 distinctvalues

(ii) Entropy 119897-diversity entropy of the values thatoccurred is greater than or equal to log 119897

(iii) Recursive 119897-diversity the most frequent value doesnot appear toomuch and the less frequent value doesnot appear too rarely

A table is considered to have 119897-diversity if all its CRs have119897-diversity Various attacks have been proposed against thisprivacy notion which results in privacy disclosure [26]Theseattacks are as follows

(i) Skewness attack [26] skewed distributions couldsatisfy 119897-diversity but still do not prevent attributedisclosure This happens when the frequency ofoccurrence of a certain sensitive attribute inside aCR differs from its frequency in the whole table(population) Consider the drug addiction rate in acertain population to be 1 One of the classes hasan addiction rate of 50 This class is consideredattribute disclosing since a member of this class isidentified as addict with a high probability

(ii) Similarity attack [26] distinct but semantically simi-lar sensitive values could satisfy 119897-diversity but meanthe same for the attacker If the sensitive informationshown in a table contains cocaine addict heroinaddict marijuana addict and so forth as distinctentries it could be considered satisfying 119897-diversitybut for a drug dealer all these entries are consideredas a target

The ldquo119905-closenessrdquo concept is another privacy notion thatdifferentiates between the information gained about thewhole population and that about specific individuals [26]As the first gain is tolerated and motivated the second isconsidered privacy breaching A class is considered to have

Table 3 Data stored at the LBS anonymizer

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1No

Location 2 NoLocation 3 Yes

CR2Location 4

Query 2No

Location 5 YesLocation 6 No

Table 4 Data captured by the adversary

Quasi-identifiers Sensitive dataCloaking region Location Query True sender

CR1Location 1

Query 1lowast

Location 2 lowast

Location 3 lowast

CR2Location 4

Query 2lowast

Location 5 lowast

Location 6 lowast

lowast refers to an anonymized Boolean value (true or false)

ldquo119905-closenessrdquo if the distance between the distribution of asensitive attribute in the studied class and the distributionin the whole table (population) is less than or equal to athreshold 119905 The distance is measured using Earth MoverrsquosDistance (EMD)

It is shown in [26] that ldquo119905-closenessrdquo ensures higherprivacy than ldquo119897-diversityrdquo

31 119870-Anonymity versus 119897-Diversity and 119905-Closeness Thedifference in nature between microdata tables and LBS CRcaptured data is the main factor that prevents the implemen-tation of ldquo119897-diversityrdquo and ldquo119905-closenessrdquo as privacy metrics inLPPMs

Anonymizing microdata keeps the sensitive attributesintact and hides parts of the quasi-identifiers to preventuniquely identifying an entity An entity can only be iden-tified as an unknown record in a class ldquo119897-diversityrdquo and ldquo119905-closenessrdquo try to keep the relationship between the entities ofa class and its sensitive attributes as private as possible usingthe techniques discussed previously

LBS anonymization tries on the contrary to hide thesensitive data instead of generalizing the quasi-identifierswhich are needed by the location-based server (locationexpertise and query) to be able to respond to the userrsquos queryTable 3 represents the LBS data stored in the anonymizer

The anonymized data that are transmitted to the LBSserver and could be captured by the adversary are shown inTable 4

As can be seen in the microdata anonymization in Tables1 and 2 and LBS anonymization in Tables 3 and 4 the latterfocuses on hiding the sensitive data which totally negates theaim behind publishing microdata

Mobile Information Systems 7

Based on the above discussion we can deduce that 119897-diversity and 119905-closeness are not suitable for LBS anonymiza-tion thus 119870-anonymity remains the best suitable privacynotion suitable for such services

We have shown in the previous sections that LBS appli-cations are in need of privacy preserving mechanisms basedon valid privacy notions We have also shown that thealready implemented privacy notion119870-anonymity could beexploited using various attacks but stricter privacy notionsare not suitable We can conclude that LPPMs using 119870-anonymity are the best available solutions To prove the needfor a novel privacy concept we will present in the comingsections a newattack on119870-anonymity thatwehave developed[29]

4 Mathematical Model

We will prove in this section that 119870-anonymity is private ifldquozero prior knowledgerdquo is assumed and the adversary couldnot perform the attack for a long continuous period

Let 1199091 1199092 119909

119899 be the set of users found in a CR

where ldquo119899rdquo is its size and 119899 ge 1 Let ldquo119909rdquo be our investigateduser If 119909 notin 119909

1 1199092 119909

119899 that is 119909 notin CR then 119909 is for sure

not the requestor The probability of a user being added to aCR without being the true requestor is

119875 (119909 isin CR (119909119894= requestor amp 119909 = 119909

119894))

=119875 (119909 isin CR cap 119909

119894= requestor)

119875 (119909119894= requestor)

=119899 minus 1

119873 minus 1

(1)

where119873 is the number of users in the area controlled by thestudied anonymizer where 1 le 119899 le 119873 The probability of auser being added to a CR while being the true requestor is

119875 (119909 isin CR (119909 = requestor)) = 1 (2)

We can deduce that the probability of a user being addedto a CR is119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

(3)

If all the users have the same frequency of sendingrequests (ie 119909

119894are equiprobable ie 119875(119909

119894= requestor) =

1119873) we can deduce that119875 (119909 = requestor119909 isin CR)

=119875 (119909 = requestor119909 isin CR)

119875 (119909 isin CR)=

(1119873)

(119899119873)=

1

119899

(4)

Based on the assumption that all users are equiprobable(since the adversary has ldquozero prior knowledgerdquo and could notperform the attack for a long continuous period he cannotassume otherwise) 119870-anonymity is satisfied if the cloakingregion size is greater than119870 that is 119899 ge 119870 We will prove inthe coming section that the privacy offered by119870-anonymitycould be breached using our proposed attack (frequencyattack) if the adversary is able to perform the attack for a longperiod even if ldquozero prior knowledgerdquo is assumed

5 Frequency Attack

Until now we have considered that all users are equiprobablebut in reality they are not Consider an LBS which can beused by a user to retrieve the location of the nearest shop(bakery Chinese restaurant etc) While being home I amaware of the most important places which I usually accessand I know the bakeries barbers supermarkets and so forthwithin my town so I need no support in locating these placeswhile on the contrary I can help othersWhen visiting places Iamnew to it will be hard forme to find certain targets (metrosupermarket etc) especially when facing language problemsMy rate of asking for guidance at these new places is muchhigher than that while being atmy known region Let comfortzone be a set of areas where a certain user is familiar with andneeds no support from the studied LBS in finding places Auserrsquos comfort zone includes his hometown and workplaceOutside this comfort zone the query frequency of this userincreases

Based on the above discussion we can easily assumethat the usersrsquo frequencies are not equal that is exist119894119875(119909

119894=

requestor) = 1119873If a user ldquo119894rdquo is to be found in a CR then either he is the real

requestor or he is not and selected from the ldquo119873minus1rdquo remainingusers

If user ldquo119894rdquo is the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = 1 since the real requestor should be inthe CR else he will not receive a valid answer

If user ldquo119894rdquo is not the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = (119899minus1)(119873minus1) that is one of the ldquo119899minus1rdquoselected users from the ldquo119873 minus 1rdquo remaining users

Let 119875119894= 119875(119909

119894= requestor) then

119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

=

119873

sum

119894=1

119894 =119909

(119899 minus 1

119873 minus 1)119875119894+ (1) 119875

119909= (

119899 minus 1

119873 minus 1)

119873

sum

119894=1

119894 =119909

119875119894+ 119875119909

Since119873

sum

119894=1

119875119894= 1 then

119873

sum

119894=1

119894 =119909

119875119894= 1 minus 119875

119909

(5)

then

119875 (119909 isin CR) = (119899 minus 1

119873 minus 1) (1 minus 119875

119909) + 119875119909

=119899 minus 1

119873 minus 1+ (

119873 minus 119899

119873 minus 1)119875119909

(6)

let

119886 = (119873 minus 119899

119873 minus 1)

119887 = (119899 minus 1

119873 minus 1)

(7)

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Mobile Information Systems 7

Based on the above discussion we can deduce that 119897-diversity and 119905-closeness are not suitable for LBS anonymiza-tion thus 119870-anonymity remains the best suitable privacynotion suitable for such services

We have shown in the previous sections that LBS appli-cations are in need of privacy preserving mechanisms basedon valid privacy notions We have also shown that thealready implemented privacy notion119870-anonymity could beexploited using various attacks but stricter privacy notionsare not suitable We can conclude that LPPMs using 119870-anonymity are the best available solutions To prove the needfor a novel privacy concept we will present in the comingsections a newattack on119870-anonymity thatwehave developed[29]

4 Mathematical Model

We will prove in this section that 119870-anonymity is private ifldquozero prior knowledgerdquo is assumed and the adversary couldnot perform the attack for a long continuous period

Let 1199091 1199092 119909

119899 be the set of users found in a CR

where ldquo119899rdquo is its size and 119899 ge 1 Let ldquo119909rdquo be our investigateduser If 119909 notin 119909

1 1199092 119909

119899 that is 119909 notin CR then 119909 is for sure

not the requestor The probability of a user being added to aCR without being the true requestor is

119875 (119909 isin CR (119909119894= requestor amp 119909 = 119909

119894))

=119875 (119909 isin CR cap 119909

119894= requestor)

119875 (119909119894= requestor)

=119899 minus 1

119873 minus 1

(1)

where119873 is the number of users in the area controlled by thestudied anonymizer where 1 le 119899 le 119873 The probability of auser being added to a CR while being the true requestor is

119875 (119909 isin CR (119909 = requestor)) = 1 (2)

We can deduce that the probability of a user being addedto a CR is119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

(3)

If all the users have the same frequency of sendingrequests (ie 119909

119894are equiprobable ie 119875(119909

119894= requestor) =

1119873) we can deduce that119875 (119909 = requestor119909 isin CR)

=119875 (119909 = requestor119909 isin CR)

119875 (119909 isin CR)=

(1119873)

(119899119873)=

1

119899

(4)

Based on the assumption that all users are equiprobable(since the adversary has ldquozero prior knowledgerdquo and could notperform the attack for a long continuous period he cannotassume otherwise) 119870-anonymity is satisfied if the cloakingregion size is greater than119870 that is 119899 ge 119870 We will prove inthe coming section that the privacy offered by119870-anonymitycould be breached using our proposed attack (frequencyattack) if the adversary is able to perform the attack for a longperiod even if ldquozero prior knowledgerdquo is assumed

5 Frequency Attack

Until now we have considered that all users are equiprobablebut in reality they are not Consider an LBS which can beused by a user to retrieve the location of the nearest shop(bakery Chinese restaurant etc) While being home I amaware of the most important places which I usually accessand I know the bakeries barbers supermarkets and so forthwithin my town so I need no support in locating these placeswhile on the contrary I can help othersWhen visiting places Iamnew to it will be hard forme to find certain targets (metrosupermarket etc) especially when facing language problemsMy rate of asking for guidance at these new places is muchhigher than that while being atmy known region Let comfortzone be a set of areas where a certain user is familiar with andneeds no support from the studied LBS in finding places Auserrsquos comfort zone includes his hometown and workplaceOutside this comfort zone the query frequency of this userincreases

Based on the above discussion we can easily assumethat the usersrsquo frequencies are not equal that is exist119894119875(119909

119894=

requestor) = 1119873If a user ldquo119894rdquo is to be found in a CR then either he is the real

requestor or he is not and selected from the ldquo119873minus1rdquo remainingusers

If user ldquo119894rdquo is the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = 1 since the real requestor should be inthe CR else he will not receive a valid answer

If user ldquo119894rdquo is not the real requestor (ie 119894 = 119909) then 119875(119909 isin

CR119909 = requestor) = (119899minus1)(119873minus1) that is one of the ldquo119899minus1rdquoselected users from the ldquo119873 minus 1rdquo remaining users

Let 119875119894= 119875(119909

119894= requestor) then

119875 (119909 isin CR)

=

119873

sum

119894=1

119875 (119909 isin CR119909119894= requestor) sdot 119875 (119909

119894= requestor)

=

119873

sum

119894=1

119894 =119909

(119899 minus 1

119873 minus 1)119875119894+ (1) 119875

119909= (

119899 minus 1

119873 minus 1)

119873

sum

119894=1

119894 =119909

119875119894+ 119875119909

Since119873

sum

119894=1

119875119894= 1 then

119873

sum

119894=1

119894 =119909

119875119894= 1 minus 119875

119909

(5)

then

119875 (119909 isin CR) = (119899 minus 1

119873 minus 1) (1 minus 119875

119909) + 119875119909

=119899 minus 1

119873 minus 1+ (

119873 minus 119899

119873 minus 1)119875119909

(6)

let

119886 = (119873 minus 119899

119873 minus 1)

119887 = (119899 minus 1

119873 minus 1)

(7)

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

8 Mobile Information Systems

then

119875 (119909119894isin CR) = 119886119875

119894+ 119887 (8)

The adversary even if he has ldquozero prior knowledgerdquocan collect CRs used by the studied application through asilent passive attack These captured CRs help the attackerto estimate the requesting frequency of each user using thefollowing procedure

Let 119909119894119895be a random variable representing the occurrence

of user ldquo119894rdquo in the 119895th collected CR (CR119895) sdot 119909119894119895= 1 if user ldquo119894rdquo

(119909119894) is found in CR

119895and 119909

119894119895= 0 otherwise Considering that

the attacker has captured ldquo119898rdquo CRs the average number ofoccurrences of user ldquo119894rdquo in the collected CRs is

119883119894=

1

119898

119898

sum

119895=1

119909119894119895 (9)

We can estimate the population rate (the real rate of user119894) from the sample rate with a confidence interval as follows

119875 (119909119894isin CR) isin [119883

119894minus 120572119878radic

119898 minus 1

119898 119883119894+ 120572119878radic

119898 minus 1

119898] 997904rArr

119875 (119909119894isin CR)

isin

[119883119894minus 119905120572

119878radic119898 minus 1

119898 119883119894+ 119905120572

119878radic119898 minus 1

119898] if 119898 lt 30

[119883119894minus

120572119878

radic119898 119883119894+

120572119878

radic119898] elsewhere

(10)

ldquo119905rdquo follows Studentrsquos 119905-distribution of ldquo119898 minus 1rdquo degreesof freedom (using Bayesian prediction) and ldquo120572rdquo followsnormal distribution (using central limit theorem) We cannow deduce 119875

119894since

997904rArr 119875 (119909 = requestor119909 isin CR) =119875119909

sum119895isinCR 119875119895

=1

119899 (11)

We have shown in this section that even with ldquozeroprior knowledgerdquo the adversary is able to estimate the userrsquosrequest rate through a passive attack aiming to collect CRsThis attack takes advantage of the fact that not all users havethe same request rate that is each put a different amount oftime utilizing a certain application User request rate escalateswhen being outside of his comfort zone which helps indifferentiating him from other users with lower request rateAs the duration of this attack increases the attackerrsquos accuracyincreases Using the above estimations the adversary is ableto identify whether a user found in a CR is the real requestorwith a probability different than (greater or less than) ldquo1119899rdquowhich results in lower entropy (uncertainty) and thus betterpredictability and this breaches the119870-anonymity constraint

We will show in the coming section the entropy and rateof successful prediction of our proposed attack

6 Simulation Results

In this section we have simulated using a specially craftedC code 50 users within an anonymizerrsquos controlled zone

Rate of successful prediction using frequency attack

0

5

10

15

20

25

30

35

40

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 6 Prediction success rate for CR = 5

Rate of successful prediction using frequency attack

0

10

20

30

40

50

()

20 50 7010()

Estimated rate of successful prediction using K-anonymity

Figure 7 Prediction success rate for CR = 4

Each user has either low or high request rates Users with highrequest rate are considered outside their comfort zone andhave 10 times more requests than low rate users The privacylevel (entropy and rate of successful requestor prediction) isstudied for different CR sizes and for sufficient samples Thetraffic factor (high request ratelow request rate = 10) is notproved on any real application but assumed as a particularcase

In Figure 6 we can see the estimated rate of successfulprediction and the rates achieved by the frequency attack for acloaking region of size 5The119909-axis represents the proportionof the population with high request rates while the 119910-axisrepresents the average probability of predicting the requestorsuccessfully

It can be shown in Figure 6 that frequency attack hasincreased the prediction success compared to the estimatedrate by (30-20)20 = 50 when 70 of the populationare outside their comfort zone to (36-20)20 = 60when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 7 we cansee the estimated rate of successful prediction and the ratesachieved by the frequency attack for a cloaking region of size4 The 119909-axis and 119910-axis are similar to those in Figure 6

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 9: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Mobile Information Systems 9

Entropy of the samples with CR = 4 using frequency attackEntropy of the samples with CR = 5 using frequency attack

3

32

34

36

38

4

20 50 7010()

Entropy of equiprobable users as estimated by K-anonymity

Figure 8 Entropy for frequency attack (CR = 4 CR = 5) andequiprobable users

It can be shown in Figure 7 that frequency attack hasincreased the prediction success compared to the estimatedrate by (34-25)25 = 30 when 70 of the populationare outside their comfort zone to (44-25)25 = 20when 20 of the population are outside their comfort zonedepending on the distribution of the requestors (percentageof users outside their comfort zone) In Figure 8 we cansee the entropy (uncertainty) estimated by 119870-anonymityThe 119909-axis represents the proportion of the population withhigh request rates while the 119910-axis represents mechanismrsquosentropy

It can be shown in Figure 8 that the entropy (uncertaintyof knowing the requestor) is lower when using frequencyattack for both CR sizes This means that the attacker is ableto estimate the real requestor with a probability greater thanldquo1119899rdquo which leads to a breach in the ldquo119870-anonymityrdquo privacyconstraint

We have shown in this section that our proposed attackis able to breach119870-anonymity with an incremental rate as theduration of this attack increases

Since we are not able to find any privacy notion that iscapable of ensuring elevated user privacy in crowdsourcedLBS we deduce that a solid privacy solution is needed andthis will be proposed in the coming section

7 The Proposed Solution

Ensuring high privacy level in crowdsourced LBS is not trivialin the following environments (attacker models)

(i) Untrusted service provider in most of the cases theuser has no signed contract with the crowdsourcedlocation-based server thus no privacy obligations areheld especially when no auditing is taking place

(ii) Hostile network crowdsourced LBSs have no meansof managing incoming trafficrsquos privacy level otherthan encryption

(iii) Untrusted service provider in hostile network mostof the LBSs are untrusted and connected to the Inter-net (hostile network) but enforce encrypted trafficThis attackermodel which is currently used results intwo attack categories (insider and neighbor) as shownin Figure 3 Figure 9 shows untrusted service providerin a hostile network

The best privacy can be achieved by a trusted serviceprovider when located within a trusted network In this casethere is no need to use LPPMs or anonymization since theattacker is neither able to access the sent queries (neighborattack) nor able to access the log files (insider attack)Any of the LPPMs found in the literature and surveyed inSection 2 generates both processing and traffic overheadthus eliminating the need for these mechanisms withoutaffecting the privacy level is definitely an achievement interms of security and performance

Since crowdsourced LBS users are in reality mobile usersour proposed solution for this privacy issue is based onour Mobile Cloud Computing architecture named OperatorCentric Mobile Cloud Architecture (OCMCA) [30] We aregoing next to describe briefly OCMCA

71 Operator Centric Mobile Cloud Architecture To decreasethe delay cost and power consumption and increase theprivacy mobility and scalability compared to other mobileCloud architectures we proposed to install a ldquoCloud serverrdquowithin the mobile operatorrsquos network as shown in Figure 10Its proposed position leads to the following

(i) Less delay the traffic does not need to pass throughthe Internet to reach the destined server

(ii) Less cost the same data contents could be deliv-ered to various users using 1 multicast channelwhich decreases congestion (scalability) and allowedcheaper pricing models

(iii) Less power consumption userrsquos jobs are computed atthe Cloud server which eliminates the need for in-house execution

(iv) More mobility a user can maintain a session with theCloud server as long as he is connected to the mobilenetwork Our mobile Cloud architecture benefitsfrom the handover mechanisms in place

(v) More scalability the Cloud servers are centralizedand accessible by all the users

(vi) More privacy the Cloud servers are managed by atrusted and secure entity as it will be shown in moredetail in the next subsection

Since users are usually connected with the same operatorfor long durations we recommend the userrsquos CSP to federatesome resources at the ldquoCloud serverrdquo and then offload theuserrsquos applications and environment settings In this caseall computation-intensive processing is implemented within

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

10 Mobile Information Systems

Serving GWPDN GW

Cellphone (mobile)eNodeB

MME

LBS

Hostile network Untrusted service provider

Internet

Figure 9 Untrusted service provider in hostile network

Serving GW

MBMS GW

BM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted network Server under trusted monitoring (trusted server)

Figure 10 OCMCA

the mobile operatorrsquos network and the terminal generateddata are offloaded to the local Cloud without the need toaccess the Internet (which decreases the cost per bit ofthe transmitted data) This Cloud should be able to triggerbroadcast messages when needed 3GPP has standardizedmulticast and broadcast packet transmission in UMTS andbroadcast transmission in LTE through a feature namedMultimedia MulticastBroadcast Service (MBMS) which wasdefined in 3GPPrsquos technical specification as follows

(i) ldquoMBMS is a point-to-multipoint service inwhich datais transmitted from a single source entity to multiplerecipientsrdquo [30]

(ii) Physical broadcasting allows network resources to beshared when transmitting the same data to multiplerecipients [30]

(iii) Its architecture ensures efficient usage of radio-network and core-network resources especially inradio interface [30]

MBMS requires the introduction of two nodes to theEvolved Packet Core (EPC) network [30] which are asfollows

(i) MBMS GW which is responsible for the connectionswith content owners Cloud servers in our case

(ii) BM-SC (Broadcast Multicast Service Center) whichprovides a set of functions for MBMS user services

If MBMS is already enabled in a studied mobile operatorthen MBMS GW and BM-SC are already installed Enablingbroadcast in LTE allows the local Cloud located within themobile operatorrsquos premises to transmit data efficiently to theusers of one or more cells

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Mobile Information Systems 11

InternetCellphone(mobile)

Smartphone(mobile)

Destination web callee at other operators

normal Cloud server and so fortheNodeB

Cloud server

BM-SC

Smartphone(mobile)

eNodeB

MBMS GW

MME

Serving GW

PDN GW

SGi

Smartphone(mobile)

Smartphone(mobile)

Normal mobile trafficUnicast Cloud trafficMulticast Cloud traffic

Figure 11 Traffic routes in OCMCA

The added nodes (MBMSGWBM-SC andCloud server)are transparent to normal mobile traffic Mobile traffic willpass as follows

(i) Normal mobile traffic eNodeB Serving GW andPacket Data Network GateWay (PDNGW) shown inFigure 11

(ii) Unicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo Serving GW and eNodeB shown in Figure 11

(iii) Multicast Cloud traffic eNodeB Serving GW ldquoCloudserverrdquo BM-SC and MBMS GW shown in Figure 11

72 Privacy Guaranteeing Architecture One of the ldquoCloudserverrsquosrdquo capabilities is to offer Software-as-a-Service (SaaS)on behalf of application developers An application developercan request the operator to host his service (application)After reviewing the source code the operator agrees onhosting this service and signs an SLA with the developer

A similar protocol is implemented byApple before addingany application to Apple Store [31]The application developertrusts Apple for not spreading the source code and for billing(charging every downloaded instance)

In our case the LBS provider will delegate his applicationto be provided by the mobile operator on his behalf as shownin Figure 12

As can be seen in Figure 12 the LBS query does notneed to pass in a hostile network which eliminates neighborattacks Since the offered service is under the operatorrsquosmonitoring the provider becomes trusted and insider attacks

become eliminated The application verification and billingmethods suitable for the operator could be used withoutaffecting the proposed solution

Since LPPMs have been proposed to prevent an attackerwho captured LBS queries from identifying the requestor ortracing the users we can consider that LPPMs are not neededanymore in this scenario since attackers are not capable ofcapturing the transmitted requests Based on this discussionwe consider the following

(i) User identity privacy is maintained The operatorhas access to userrsquos real identity (IMSI MSISDNand name) and generates temporary identities (TMSIGUTI etc) to prevent real identity breach Userrsquosidentity privacy is maintained since the operator isconsidered a trusted entity but in all cases it alreadyhas access to this information thus our proposedsolution does not offer any new information that theoperator could benefit from

(ii) User location privacy is maintained The operatoralready has access to any userrsquos location and does notbenefit from additional sensitive information leadingto a transparent and strong location privacy solutionfor the user

8 Conclusion

User privacy in crowdsourced LBS is unacceptable in itscurrent state Available location privacy preserving mecha-nisms result in unnecessary overhead without being able to

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 12: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

12 Mobile Information Systems

Serving GW

MBMS GWBM-SC

PDN GW

Cloud server

Cellphone (mobile)

Smartphone (mobile)

eNodeB

MME

Trusted networkServer under trusted monitoring (trusted

server)

LBS

Delegatedservices

Figure 12 Crowdsourced LBS delegation

satisfy the needed requirementsWe have shown that even thestate-of-the-art approach such as119870-anonymity is not suitableto fulfill the required privacy needs as it is vulnerable toour defined frequency attack Thus a new privacy model isneeded and this is what we have proposed in this paper

The delegation of services into the Cloud server inside theoperatorrsquos trusted network prevents all the attacks proposedby the attacker model shown in Figure 3

Competing Interests

The authors declare that they have no competing interests

References

[1] Google Cloud httpsCloudgooglecom[2] A Bozzon M Brambilla and S Ceri ldquoAnswering search

queries with CrowdSearcherrdquo in Proceedings of the 21st AnnualConference on World Wide Web (WWW rsquo12) pp 1009ndash1018ACM Perth Australia April 2012

[3] B L Bayus ldquoCrowdsourcing new product ideas over timean analysis of the Dell IdeaStorm communityrdquo ManagementScience vol 59 no 1 pp 226ndash244 2013

[4] J Howe Crowdsourcing How the Power of the Crowd Is Drivingthe Future of Business Random House New York NY USA2008

[5] M F Bulut Y S Yilmaz and M Demirbas ldquoCrowdsourcinglocation-based queriesrdquo in Proceedings of the 9th IEEE Interna-tional Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops rsquo11) pp 513ndash518 SeattleWash USA March 2011

[6] D Horowitz and S D Kamvar ldquoThe anatomy of a large-scalesocial search enginerdquo in Proceedings of the 19th InternationalWorld Wide Web Conference (WWW rsquo10) pp 431ndash440 RaleighNC USA April 2010

[7] G Chatzimilioudis A Konstantinidis C Laoudias and DZeinalipour-Yazti ldquoCrowdsourcing with smartphonesrdquo IEEEInternet Computing vol 16 no 5 pp 36ndash44 2012

[8] Chachacom httpwwwchachacom[9] Foursquarecom httpsfoursquarecom

[10] W S Lasecki RWesley J Nichols A Kulkarni J F Allen and JP Bigham ldquoChorus a crowd-powered conversational assistantrdquoin Proceedings of the 26th Annual ACM Symposium on UserInterface Software and Technology (UIST rsquo13) pp 151ndash162 StAndrews UK October 2013

[11] A Noulas S Scellato C Mascolo and M Pontil ldquoAn empiricalstudy of geographic user activity patterns in foursquarerdquo inProceedings of the International Conference on Weblogs andSocial Media AAAI Barcelona Spain July 2011

[12] How effective drug addiction treatment httpwwwdrugabusegovpublicationsprinciples-drug-addiction-treatment-re-search-based-guide-third-editionfrequently-asked-questionshow-effective-drug-addiction-treatment

[13] Drug Rehab Statistics httpwwwfuturesofpalmbeachcomdrug-rehabstatistics

[14] J Bou Abdo H Chaouchi and J Demerjian ldquoSecurity inemerging 4G networksrdquo in Next-Generation Wireless Technolo-gies 4G and Beyond pp 243ndash272 Springer London UK 2013

[15] Location-Based Service httpenwikipediaorgwikiLocation-based service

[16] M E Nergiz M Atzori and Y Saygin ldquoTowards trajectoryanonymization a generalization-based approachrdquo in Proceed-ings of the SIGSPATIAL ACM GIS International Workshop onSecurity and Privacy in GIS and LBS (SPRINGL rsquo08) pp 52ndash61November 2008

[17] S SteinigerMNeun andA Edwardes Foundations of LocationBased Services Lecture Notes on LBS v 10 2006

[18] D Quercia N Lathia F Calabrese G Di Lorenzo and JCrowcroft ldquoRecommending social events from mobile phonelocation datardquo in Proceedings of the IEEE 10th InternationalConference on Data Mining (ICDM rsquo 10) pp 971ndash976 IEEESydney Australia 2010

[19] B Jiang and X Yao ldquoLocation-based services and GIS inperspectiverdquo Computers Environment and Urban Systems vol30 no 6 pp 712ndash725 2006

[20] G Ghinita P Kalnis A Khoshgozaran C Shahabi and K-LTan ldquoPrivate queries in location based services anonymizersare not necessaryrdquo in Proceedings of the ACM SIGMOD Interna-tional Conference on Management of Data (SIGMOD rsquo08) pp121ndash132 ACM 2008

[21] G Ghinita P KalnisM Kantarcioglu and E Bertino ldquoA hybridtechnique for private location-based queries with database

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 13: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Mobile Information Systems 13

protectionrdquo in Advances in Spatial and Temporal DatabasesN Mamoulis T Seidl T B Pedersen K Torp and I AssentEds vol 5644 of Lecture Notes in Computer Science pp 98ndash116Springer Berlin Germany 2009

[22] L Sweeney ldquo119896-anonymity a model for protecting privacyrdquoInternational Journal of Uncertainty Fuzziness and Knowledge-Based Systems vol 10 no 5 pp 557ndash570 2002

[23] Generation Partnership Project 3GPP T TR 33821 V900(2009-06) 3GPP Rationale and Track of Security Decisionsin Long Term Evolved (LTE) RAN3GPP System ArchitectureEvolution (SAE) (Release 9)

[24] J Bou Abdo J Demerjian and H Chaouchi ldquoSecurity VSQos for LTE authentication and key agreement protocolrdquoInternational Journal of Network Security amp Its Applications vol4 no 5 pp 71ndash82 2012

[25] A Machanavajjhala D Kifer J Gehrke and M Venkitasub-ramaniam ldquoL-diversity privacy beyond k-anonymityrdquo ACMTransactions on Knowledge Discovery from Data vol 1 no 1article 3 2007

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of the IEEE23rd International Conference on Data Engineering pp 106ndash115Istanbul Turkey April 2007

[27] X Liu K Liu L Guo X Li and Y Fang ldquoA game-theoreticapproach for achieving k-anonymity in Location Based Ser-vicesrdquo in Proceedings of the 32nd IEEE Conference on ComputerCommunications (IEEE INFOCOM rsquo13) pp 2985ndash2993 TurinItaly April 2013

[28] D Rebollo-Monedero J Forne and J Domingo-Ferrer ldquoFromt-closeness-like privacy to postrandomization via informationtheoryrdquo IEEE Transactions on Knowledge and Data Engineeringvol 22 no 11 pp 1623ndash1636 2010

[29] J B Abdo J Demerjian H Chaouchi T Atechian and CBassil ldquoPrivacy using mobile cloud computingrdquo in Proceedingsof the 5th International Conference on Digital Information andCommunication Technology and Its Applications (DICTAP rsquo15)pp 178ndash182 IEEE Beirut Lebanon May 2015

[30] J B Abdo J Demerjian H Chaouchi K Barbar andG PujolleldquoOperator centric mobile cloud architecturerdquo in Proceedings ofthe IEEE Wireless Communications and Networking Conference(WCNC rsquo14) pp 2982ndash2987 IEEE Istanbul Turkey April 2014

[31] Submitting Your Apple to the Store httpsdeveloperApplecomlibraryiosdocumentationIDEsConceptualAppDistri-butionGuideSubmittingYourAppSubmittingYourApphtml

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 14: Research Article Extended Privacy in Crowdsourced Location ...downloads.hindawi.com/journals/misy/2016/7867206.pdf · popular application o ering crowdsourced location-based service

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014