Privacy Preserving Data Mining on Moving Object Trajectories

Győző Gidófalvi (Geomatic ApS, Center for Geoinformatik)
Xuegang Harry Huang, Torben Bach Pedersen (Aalborg University)

2 MDM, Mannheim 2007

Outline

Need for location privacy

Quantifying location privacy

Existing methods / systems for location privacy

Undesired loss of privacy

Privacy-preserving, grid-based framework for collecting and mining location data

  System architecture
  Anonymization / partitioning policies
  Mining of anonymized trajectories
  Experiments and results

Conclusions

Need for Location Privacy

“New technologies can pinpoint your location at any time and place. They promise safety and convenience but threaten privacy and security”

Cover story, IEEE Spectrum, July 2003

Location-Based Services

With all its privacy threats, why do users still use location-detection devices?

Wide spread of LBSs:
  Location-based store finders
  Location-based traffic reports
  Location-based advertisements

LBSs rely on the implicit assumption that users agree to reveal their private locations

LBSs trade their services for privacy

LBSs make heavy use of context, including patterns in the location data, which is extracted using data mining methods.

Quantifying Location Privacy

User perception of privacy: “One should only be able to infer a region R that I am in”

Things to consider: external spatio-temporal knowledge
  Location and movement of objects is limited in space and time
    Road networks and other geographical constraints
    Location of businesses
    Opening hours
  Locations of other moving objects in a given region

Common measures for location privacy:
  The area of R >= A_min
  k-anonymity: there should be at least (k-1) other moving objects in R
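Both measures can be checked mechanically for a candidate region R. The following is a minimal sketch, assuming R is an axis-aligned rectangle and other objects are given as points; the names `Rect`, `satisfies_min_area`, and `satisfies_k_anonymity` are illustrative and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned region R (hypothetical helper type, not from the talk)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def satisfies_min_area(region: Rect, a_min: float) -> bool:
    """Minimum-area measure: area of R >= A_min."""
    return region.area() >= a_min


def satisfies_k_anonymity(region: Rect, others, k: int) -> bool:
    """k-anonymity: at least k-1 OTHER moving objects are inside R.

    `others` holds (x, y) locations of all objects except the user.
    """
    inside = sum(1 for (x, y) in others if region.contains(x, y))
    return inside >= k - 1
```

A region can pass one check and fail the other, which is why some systems enforce both.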

Existing Methods / Systems for Location Privacy

Considering k-anonymity only:
  Adaptive-Interval Cloaking Algorithm (MobiSys’03)
    Recursively divide the entire system area into quadrants of equal size, keeping the smallest quadrant that still includes the user and k-1 other users
  Clique-Cloak Algorithm (ICDCS’05)
    A clique graph is constructed to search for a minimum bounding rectangle that includes the user and k-1 other users

Considering both k-anonymity and minimum area:
  CASPER: Adaptive Location Anonymizer + Privacy-Aware Query Processor (VLDB’06)
    Grid-based pyramid structure to put the exact location of mobile users into cloaking rectangles made of grid cells
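The quadrant-subdivision idea behind interval cloaking can be illustrated with a short sketch. This is a simplified reading, not the published algorithm: it assumes point locations, a rectangular system area given as (x_min, y_min, x_max, y_max), and a hypothetical `max_depth` cap to stop subdividing when users share a location.

```python
def interval_cloak(user, others, area, k, max_depth=16):
    """Return the smallest quadrant containing `user` and >= k-1 other users.

    Simplified sketch: boundaries are treated as closed, so a point lying
    exactly on a dividing line may be counted in neighboring quadrants.
    """
    x_min, y_min, x_max, y_max = area
    points = [user] + list(others)
    for _ in range(max_depth):
        x_mid = (x_min + x_max) / 2
        y_mid = (y_min + y_max) / 2
        # Pick the quadrant of the current region that contains the user.
        qx_min, qx_max = (x_min, x_mid) if user[0] < x_mid else (x_mid, x_max)
        qy_min, qy_max = (y_min, y_mid) if user[1] < y_mid else (y_mid, y_max)
        count = sum(
            1 for (px, py) in points
            if qx_min <= px <= qx_max and qy_min <= py <= qy_max
        )
        if count < k:
            break  # the quadrant is too sparse: keep the current region
        x_min, y_min, x_max, y_max = qx_min, qy_min, qx_max, qy_max
    return (x_min, y_min, x_max, y_max)
```

Note that the returned rectangle depends on where the other users currently are, so the same location can be reported with different rectangles at different times.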

Problems with Existing Systems / Methods

1) They require trusted middleware.

2) How to provide k-anonymity in environments with untrusted components is unknown, and doing so is likely to be computationally prohibitive.

3) The notion of location privacy guaranteed by k-anonymity may not be satisfactory (e.g., a large number of users in a small area where they do not want to be observed).

4) Non-deterministically or probabilistically reporting differently sized and/or positioned cloaking rectangles for the same location sacrifices location privacy.

5) Traditional mining methods cannot be easily / efficiently extended to the anonymized location data.

Undesired Loss of Privacy: Example

By studying the “anonymized” locations of a single user, one can conclude and quantify the likelihood that the user is in the intersection of the cloaked rectangles returned on subsequent visits.
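The effect can be made concrete with a small sketch. It assumes an observer who collects several cloaked rectangles, each (x_min, y_min, x_max, y_max), that were reported for the same true location; the function names and sample coordinates are illustrative only.

```python
def intersect(a, b):
    """Intersection of two rectangles, or None if they do not overlap."""
    x_min, y_min = max(a[0], b[0]), max(a[1], b[1])
    x_max, y_max = min(a[2], b[2]), min(a[3], b[3])
    if x_min >= x_max or y_min >= y_max:
        return None
    return (x_min, y_min, x_max, y_max)


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def narrowed_region(reports):
    """Region an observer can infer from all rectangles reported for one place."""
    region = reports[0]
    for r in reports[1:]:
        region = intersect(region, r)
        if region is None:
            return None  # reports are inconsistent with a single location
    return region


# Three visits to the same place, cloaked differently each time.
reports = [(0, 0, 4, 4), (2, 1, 6, 5), (1, 2, 5, 6)]
region = narrowed_region(reports)
```

In this example each report covers 16 area units, but their intersection covers only 4, so the observer's uncertainty shrinks to a quarter of what any single report suggests.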

Privacy-preserving, grid-based framework for collecting and mining location data

Goals:
  Avoid privacy loss
  Avoid using trusted middleware component
  Allow users to specify desired level of privacy
  Obtain detailed and accurate patterns from anonymized data

Approach:
  Deterministic mapping from exact locations to anonymization rectangles based on a single predefined 2D grid and user privacy settings
  Smart clients are responsible for constructing anonymization rectangles by partitioning the space based on the 2D grid
  Anonymization rectangles are constructed so that one can only infer with probability <= maxLocProb which grid cell the user is in
  Mine probabilistic patterns from probabilistic location data
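A deterministic mapping of this kind might look as follows. This is a sketch under assumptions not stated in the slides: partitions are square blocks of p x p grid cells aligned at the grid origin, cells are equally likely, and the names `partition_side` and `anonymization_rectangle` are hypothetical.

```python
import math


def partition_side(max_loc_prob: float) -> int:
    """Smallest block side p (in cells) such that 1 / (p * p) <= max_loc_prob."""
    if not 0.0 < max_loc_prob <= 1.0:
        raise ValueError("max_loc_prob must be in (0, 1]")
    p = 1
    while 1.0 / (p * p) > max_loc_prob:
        p += 1
    return p


def anonymization_rectangle(x: float, y: float, cell_size: float, p: int):
    """Map an exact location to the block of p x p grid cells containing it.

    Returns inclusive cell indices (col_min, row_min, col_max, row_max).
    The result depends only on the location and p, never on other users,
    so repeated visits to the same place yield the same rectangle.
    """
    col = math.floor(x / cell_size)
    row = math.floor(y / cell_size)
    block_col, block_row = col // p, row // p
    return (block_col * p, block_row * p,
            block_col * p + p - 1, block_row * p + p - 1)
```

Because the client can compute this locally from the shared grid and its own privacy setting, no trusted middleware is needed to build the rectangle.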

System Architecture

Partitioning Policies

CRP: Common Regular Partitioning
  All users use a common regular partitioning
  Guarantees the same, in-space uniform privacy level for all users

IRP: Individual Regular Partitioning
  Every user uses his/her own regular partitioning
  Guarantees in-space uniform, individual privacy levels to users

IIP: Individual Irregular Partitioning
  Every user specifies a few private locations with corresponding desired levels of privacy, and constructs a partitioning that meets these requirements
  Guarantees in-space non-uniform, individual privacy levels to users

[Figure labels: Home, Office]

Mining of Anonymized Trajectories

Probabilistic data:
  A location of an object is reported in terms of an anonymization rectangle (i.e., a set of n grid cells)
  The object is inside grid cell c_i with location probability c_i.prob (= 1/n)

Probabilistic Dense Spatio-Temporal Area Query:
  During a time period, find all grid cells where the maximum number of objects is >= min_count and the average location probability of the cell is >= min_prob

Evaluation of query results:
  Set of true patterns D
  False negative (N): the error of not finding a pattern that does exist in the data; FNR = |N| / |D|
  False positive (P): the error of finding a “pattern” that does not exist in the data; FPR = |P| / |D|
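The query and the two error rates can be sketched for a single time period. This is one possible reading of the definitions above, not the authors' implementation: the "maximum number of objects" of a cell is taken to be the number of objects whose anonymization rectangle covers it, and the input format and function names are assumptions.

```python
from collections import defaultdict


def dense_cells(reports, min_count, min_prob):
    """Cells that are probabilistically dense in one time period.

    `reports` maps an object id to its non-empty set of grid cells
    (its anonymization rectangle); each cell gets probability 1/n.
    """
    probs = defaultdict(list)
    for cells in reports.values():
        p = 1.0 / len(cells)
        for c in cells:
            probs[c].append(p)
    return {
        c for c, ps in probs.items()
        if len(ps) >= min_count and sum(ps) / len(ps) >= min_prob
    }


def error_rates(found, true_patterns):
    """Return (FNR, FPR), both normalized by |D| as defined above."""
    false_neg = true_patterns - found   # N: patterns that were missed
    false_pos = found - true_patterns   # P: reported but not real
    d = len(true_patterns)
    return len(false_neg) / d, len(false_pos) / d
```

Raising min_prob filters out cells that are covered mostly by large, low-probability rectangles, which is the lever for trading false positives against false negatives.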

Experiment Setup

DATA: 600-3000 trajectories from Brinkhoff’s network-based generator of moving objects

4 types of partitioning based on the 3 policies:
  2x2 CRP and 4x4 CRP
  IRP (at most 4x4 grid cells / partition)
  IIP (at most 4x4 grid cells at the start and end of trajectories, otherwise 1x1)

Evaluation of mining accuracy (FPR & FNR) under varying:
  Parameter settings for min_count and min_prob
  Grid size
  Time span
  Number of trajectories

Results

Few false negatives

The number of false positives increases with min_count and grid size, and decreases with time span and the number of trajectories

For each policy, min_prob can be tuned to find an optimal trade-off between false positives and false negatives

Conclusions

Non-deterministic generalization can lead to privacy loss

Presented a grid-based framework for anonymized data collection and mining that:
  Does not require a trusted middleware
  Maps exact locations to anonymization rectangles in a deterministic way according to one of three anonymization policies: CRP, IRP, IIP
  Avoids privacy loss
  Allows users to specify individual desired privacy levels
  Allows traditional data mining methods to be extended to discover accurate patterns while meeting the privacy requirements
  Is computationally effective and conceptually simple

Thank you for your attention!