The world’s libraries. Connected. Managing your Private and Public Data: Bringing down Inference...

Managing your Private and Public Data: Bringing down Inference Attacks against your Privacy
Group Meeting in 2015

Interested Papers
Salman Salamatian, Amy Zhang, Flávio du Pin Calmon, Sandilya Bhamidipati, Nadia Fawaz, Branislav Kveton, Pedro Oliveira, Nina Taft: Managing Your Private and Public Data: Bringing Down Inference Attacks Against Your Privacy. J. Sel. Topics Signal Processing 9(7) (2015)
Stratis Ioannidis, Andrea Montanari, Udi Weinsberg, Smriti Bhagat, Nadia Fawaz, Nina Taft: Privacy tradeoffs in predictive analytics. SIGMETRICS 2014

Background (1)
Online users are routinely asked to provide feedback about their preferences and tastes.

Motivation (1)
Public data: such data enables useful services; such attributes are rarely considered private.
Private data: such as income level, political affiliation, ...
Inference attack: an adversary can learn users' private data from the public information.

Motivation (2)
Inference threats from user data are everywhere.
Narayanan et al. show that disclosure of movie ratings can lead to full de-anonymization.
Kosinski et al. show that several personality traits, including political views, sexual orientation, and drug use, can be accurately predicted from Facebook "likes".
Weinsberg et al. show that gender can be inferred from movie ratings with close to 80% accuracy.

Problem Definition
How can a user release her public data while preventing inference attacks that may learn her private data from the public information?

Method
Distorting data prior to its release to an untrusted analyst.
Distortion vs. estimation accuracy tradeoffs.
[Figure: public data and private data, with a distortion applied before release; labels X, Y, f(X), A; constraint: mutual information < threshold]

Content Filtering Approach (2)
The privacy-preserving mapping should:
Render any statistical inference of $A$ based on the observation of $\hat{B}$ harder.
Preserve some utility of the released data. This can be modeled by a constraint on the average distortion.

Threat Model (1)
$A$: the vector of personal attributes that the user wants to keep private
$B$: the vector of data he is willing to make public
$\mathcal{A}$, $\mathcal{B}$: the sets from which $A$ and $B$ can assume values
$A$ is linked to $B$ by the joint distribution $p_{A,B}$.
Release a distorted version of $B$, denoted $\hat{B}$.
$p_{\hat{B} \mid B}$: privacy-preserving mapping

Threat Model (2)
Step 1: prior to observing $\hat{B}$, the adversary chooses a belief $q$ on the data $A$. The belief $q$ is obtained by minimizing an expected cost function.
Step 2: after observing $\hat{B}$, the adversary updates his inference method by minimizing the expected cost conditioned on the observation.

Threat Model (3)
The average cost gain by the adversary after observing $\hat{B}$: the minimum expected cost of Step 1 minus the average minimum expected cost of Step 2.
Goal: minimize this gain.
Perfect privacy: the gain is zero.
How to compute the gain?

Preparation
The self-information (or log-loss) cost function is given by $C(a, q) = -\log q(a)$.
Explanation: differential entropy.
Hence, for the minimizing $q$, the Step 1 cost is $H(A)$ and the Step 2 cost, averaged over observations, is $H(A \mid \hat{B})$.

Preparation
Combine: hence the average gain is $H(A) - H(A \mid \hat{B})$.
Moreover, $H(A \mid \hat{B}) = H(A, \hat{B}) - H(\hat{B})$.

Threat Model (2)
Hence
$I(A; B) = H(A) + H(B) - H(A, B) = H(A) + H(B) - (H(A \mid B) + H(B)) = H(A) - H(A \mid B)$

Privacy-Accuracy Framework
Since [equation missing]
Substitute the above equation into the definition of [symbol missing]
Denote: [equation missing]

Practical challenges
Challenge 1: Mismatched prior
The prior distribution $p_{A,B}$ is required as input.
a. Often the true prior distribution is not available
b. Only a limited set of samples of the private and public data can be observed
Privacy non-conscious
Challenge 2: Large data
Designing the mapping requires characterizing the value of $p_{\hat{B} \mid B}(\hat{b} \mid b)$ for all possible pairs $(b, \hat{b})$.

Solve the First Challenge
[symbol missing] is a good estimate of [symbol missing]

Solve the First Challenge
Bound on the probability of [symbol missing] being large

Solve the Second Challenge
Step 1: a quantization step maps the symbols in alphabet $\mathcal{B}$ to representative examples in a smaller alphabet $\mathcal{C}$.
Step 2: we learn a privacy-preserving mapping on the new alphabet $\mathcal{C}$, where [equation missing]
Step 3: the symbols in $\mathcal{B}$ are mapped to the representative examples based on the learned mapping.

Threat Model (3)
The joint probability distribution over $A$ and $C$ is defined as
$p_{A,C}(a, c) = \sum_{b \sim c} p_{A,B}(a, b)$
where $b \sim c$ means that the symbol $b$ is in the cluster represented by center $c$.
The symbols in $\mathcal{B}$ are mapped to [symbol missing] according to [equation missing], where [symbol missing] is a function that maps a symbol in $\mathcal{B}$ to a cluster center in $\mathcal{C}$.
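
A minimal sketch of the quantization step, assuming the joint distribution over A and C is obtained by summing the prior p(a, b) over every symbol b that falls in the cluster of center c. The toy prior `p_ab`, the cluster assignment `cluster_of`, and both function names are hypothetical and chosen only for illustration; they are not taken from the papers. The sketch also computes the mutual information before and after quantization, which is the leakage measure the slides arrive at under log-loss.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Return I(A;B) in bits for a joint pmf given as {(a, b): probability}."""
    p_a = defaultdict(float)
    p_b = defaultdict(float)
    for (a, b), p in joint.items():
        p_a[a] += p
        p_b[b] += p
    # Sum p(a,b) * log2(p(a,b) / (p(a) p(b))) over pairs with nonzero mass.
    return sum(
        p * math.log2(p / (p_a[a] * p_b[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def quantize_joint(joint, cluster_of):
    """Aggregate p(a, b) into p(a, c) by summing over all b assigned to c."""
    joint_ac = defaultdict(float)
    for (a, b), p in joint.items():
        joint_ac[(a, cluster_of[b])] += p
    return dict(joint_ac)


# Hypothetical toy prior: A is a binary private attribute,
# B takes one of four public symbols.
p_ab = {
    (0, "b1"): 0.20, (0, "b2"): 0.15, (0, "b3"): 0.10, (0, "b4"): 0.05,
    (1, "b1"): 0.05, (1, "b2"): 0.10, (1, "b3"): 0.15, (1, "b4"): 0.20,
}
# Hypothetical clustering: four public symbols collapse to two centers.
cluster_of = {"b1": "c1", "b2": "c1", "b3": "c2", "b4": "c2"}

p_ac = quantize_joint(p_ab, cluster_of)
i_ab = mutual_information(p_ab)  # leakage about A from the raw public data
i_ac = mutual_information(p_ac)  # leakage about A from the quantized data

print(f"I(A;B) = {i_ab:.4f} bits, I(A;C) = {i_ac:.4f} bits")
```

Because C is a deterministic function of B, the data processing inequality guarantees I(A;C) is no larger than I(A;B), so the mapping learned on the smaller alphabet starts from data that already leaks no more about A than the original.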