Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear...
![Page 1: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/1.jpg)
Successes of Differential Privacy
Cynthia Dwork, Harvard University
![Page 2: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/2.jpg)
Pre-Modern Cryptography
Propose
Break
![Page 3: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/3.jpg)
Modern Cryptography
[Diagram labels: Propose Definition; algorithms satisfying definition (Algs); Break Definition; Propose STRONGER Definition]
![Page 4: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/4.jpg)
Modern Cryptography
[Diagram labels: Propose Definition; algorithms satisfying definition (Algs); Break Definition; Propose STRONGER Definition]
![Page 5: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/5.jpg)
No Algorithm?
Propose
Definition
?
Why?
![Page 6: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/6.jpg)
Provably No Algorithm?
Bad Definition
Propose
Definition
?
Propose
WEAKER/DIFF Definition
Alg / ?
![Page 7: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/7.jpg)
Scientific Launch
1. Methodology
2. Engaging with negative results
Dinur-Nissim
![Page 8: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/8.jpg)
Fundamental Law of Info Recovery
“Overly accurate” estimates of “too many” statistics destroy privacy.
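The law can be illustrated with a small brute-force reconstruction in the style of the Dinur-Nissim attack. This is only a sketch under stated assumptions: the database is a hypothetical vector of `n` secret bits packed into an integer, every subset-count query is answered with error at most `max_error`, and the function names are invented for the example. Under those assumptions, any database consistent with all released answers can differ from the true one in at most `4 * max_error` positions.

```python
import random
from typing import List


def popcount(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def noisy_answers(secret: int, n: int, max_error: int, rng: random.Random) -> List[int]:
    """Answer every subset-count query (a bitmask s) with error at most max_error."""
    return [popcount(secret & s) + rng.randint(-max_error, max_error)
            for s in range(1 << n)]


def consistent_candidates(answers: List[int], n: int, max_error: int) -> List[int]:
    """All databases whose exact answers lie within max_error of every released answer."""
    found = []
    for cand in range(1 << n):
        if all(abs(popcount(cand & s) - a) <= max_error
               for s, a in enumerate(answers)):
            found.append(cand)
    return found


# Illustrative parameters (assumptions, not taken from the talk).
n, max_error = 10, 1
rng = random.Random(0)
secret = rng.getrandbits(n)
answers = noisy_answers(secret, n, max_error, rng)
candidates = consistent_candidates(answers, n, max_error)
worst = max(popcount(c ^ secret) for c in candidates)
print(f"{len(candidates)} consistent candidates; worst Hamming error {worst} of {n} bits")
```

The attack is exponential in `n` and only meant to make the statement concrete: too many answers that are too accurate pin the data down almost completely.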
![Page 9: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/9.jpg)
Scientific Launch
1. Methodology
2. Engaging with negative results
Dinur-Nissim; impossibility of semantic security (Terry Gross)
3. Algorithmic Approach
Privacy-preserving programming from a few primitives
Randomized response (RR), symmetric noise, exponential mechanism (EM): the ORs and ANDs of DP
The astonishing Blum-Ligett-Roth result
Composition
Analytical insights: sparse vector and PMW; geometric view
4. Complexity
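The three primitives named under the algorithmic approach (randomized response, symmetric noise, and the exponential mechanism) can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function names are hypothetical, Laplace noise is used as one common choice of symmetric noise, and floating-point subtleties that matter in real deployments are ignored.

```python
import math
import random


def randomized_response(truth: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true bit with probability e^eps / (1 + e^eps), otherwise flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return truth if rng.random() < p_truth else not truth


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def laplace_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """A counting query has sensitivity 1, so the noise scale is 1 / epsilon."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


def exp_mech_probs(scores, epsilon: float, sensitivity: float):
    """Selection probabilities proportional to exp(eps * score / (2 * sensitivity))."""
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng):
    """Pick one candidate at random, favoring high scores."""
    probs = exp_mech_probs(scores, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


rng = random.Random(0)
print(randomized_response(True, 1.0, rng))
print(laplace_count(42, 0.5, rng))
print(exponential_mechanism(["a", "b", "c"], [3.0, 1.0, 0.0], 1.0, 1.0, rng))
```

Under basic composition, running an ε₁-DP mechanism and then an ε₂-DP mechanism on the same data is (ε₁ + ε₂)-DP, which is what lets these pieces be combined into larger programs.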
![Page 10: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/10.jpg)
Fruitful Interplay with Other Fields
Learning theory, discrepancy theory, cryptography, geometry, complexity theory, mechanism design, pseudorandomness, communication complexity, machine learning, (robust) statistics, fingerprinting codes, coding theory
![Page 11: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/11.jpg)
Rich Algorithmic Literature
Counts, linear queries, histograms, contingency tables (marginals)
Location and spread (e.g., median, interquartile range)
Dimension reduction (PCA, SVD), clustering
Support Vector Machines
Sparse regression/LASSO, logistic and linear regression
Gradient descent
Boosting, Multiplicative Weights
Combinatorial optimization, mechanism design
Privacy Under Continual Observation, Pan-Privacy
Kalman filtering
Statistical Queries learning model, PAC learning
False Discovery Rate control
…
![Page 12: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/12.jpg)
Outreach
![Page 13: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/13.jpg)
Formative engagement with statistics
Led to earliest public deployment
![Page 14: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/14.jpg)
Social Science Research
![Page 15: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/15.jpg)
Law, Economics, Medicine,…
PLSC, Berkman, Brussels, Simons Foundation, EC, iDASH,…
Omics: Stanford (past); IPAM (upcoming); Society for Epidemiologic Research
![Page 16: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/16.jpg)
Policy
![Page 17: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/17.jpg)
Policy
CPUC hearings on Energy Data Center, the ruling, the Southern
CA power company
Podesta report, PCAST report
Commission on Evidence-Based Policymaking
Consumer Financial Protection Bureau
…
![Page 18: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/18.jpg)
Deployment
RAPPOR, Google more generally, Apple,…
A couple of startups (LeapYear, Privitar)
Census – OnTheMap and upcoming
Help wanted!
![Page 19: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/19.jpg)
Deployment
RAPPOR, Google more generally, Apple,…
Help wanted!
A couple of startups (LeapYear, Privitar)
Help wanted!
Census – OnTheMap and upcoming
Help wanted!
![Page 21: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/21.jpg)
DP when Privacy is not a Concern
Markets, Economics, Game Theory
Hartline, McSherry, Talwar; Roth; Pai and Roth; Lykouris, Syrgkanis, and Tardos
Fairness in Algorithmic Classification
Generalizability under adaptive analysis
![Page 22: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/22.jpg)
Fairness Through Awareness
Dwork, Hardt, Pitassi, Reingold, Zemel 2012
![Page 23: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/23.jpg)
Individual Fairness
People who are similar with respect to a specific classification task should be treated similarly
$S$ + math $\sim$ $S^c$ + finance
“Fairness Through Awareness”
[Diagram: the classifier $M : V \to O$ maps each individual $x \in V$ (individuals) to $M(x) \in O$ (classification outcomes)]
![Page 24: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/24.jpg)
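Individual Fairness
Classifier $M : V \to \Delta(O)$
Lipschitz: $\|M(x) - M(y)\| \le d(x, y)$
[Diagram: $M$ maps individuals $x, y \in V$ at tiny distance $d$ to nearby distributions over $O$]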
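The Lipschitz condition can be checked directly on a small finite example. The sketch below makes several assumptions that are not on the slide: distributions over outcomes are plain dictionaries, the distance between distributions is total variation, and the task-specific metric is the absolute difference of hypothetical qualification scores. The individuals and numbers are invented for illustration.

```python
from itertools import combinations


def total_variation(p, q):
    """Total variation distance between two distributions given as outcome -> probability."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


def lipschitz_violations(classifier, metric, individuals, tol=1e-12):
    """Return the pairs (x, y) for which D(M(x), M(y)) exceeds d(x, y)."""
    return [(x, y) for x, y in combinations(individuals, 2)
            if total_variation(classifier[x], classifier[y]) > metric(x, y) + tol]


# Hypothetical task-specific metric: distance between qualification scores.
scores = {"alice": 0.9, "bob": 0.8, "carol": 0.2}
people = ["alice", "bob", "carol"]


def metric(x, y):
    return abs(scores[x] - scores[y])


# A randomized classifier whose accept probability changes slowly with the score.
smooth = {
    "alice": {"accept": 0.70, "reject": 0.30},
    "bob": {"accept": 0.65, "reject": 0.35},
    "carol": {"accept": 0.35, "reject": 0.65},
}

# A hard threshold that separates two very similar individuals.
threshold = {
    "alice": {"accept": 1.0, "reject": 0.0},
    "bob": {"accept": 0.0, "reject": 1.0},
    "carol": {"accept": 0.0, "reject": 1.0},
}

print(lipschitz_violations(smooth, metric, people))
print(lipschitz_violations(threshold, metric, people))
```

The hard threshold treats alice and bob very differently even though the metric says they are close, which is exactly what the Lipschitz constraint rules out.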
![Page 25: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/25.jpg)
Lipschitz Mappings
| | Differential Privacy | Individual Fairness |
| --- | --- | --- |
| Objects | Databases | Individuals |
| Outcomes | Output of statistical analysis | Classification outcome |
| Similarity | General purpose metric | Task-specific metric |
Can use DP techniques for fairness
Theorem: Exponential mechanism of [MT07] yields individual fairness
and small loss when the metric has bounded doubling dimension.
![Page 26: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/26.jpg)
Which is “Right”?
![Page 27: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/27.jpg)
Statistical Validity in Adaptive Data Analysis
Dwork, Feldman, Hardt, Pitassi, Reingold, Roth
![Page 28: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/28.jpg)
$q_i$ depends on $a_1, a_2, \ldots, a_{i-1}$
Differential privacy neutralizes risks incurred by adaptivity
Hard to find a query for which the data set is not representative
[Diagram: a data analyst sends queries q1, q2, q3 to the database curator M and receives answers a1, a2, a3]
![Page 29: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/29.jpg)
The Re-Usable Holdout
“Training”
“Holdout”
Learn on the training set
Check against holdout via a differentially private mechanism
Future exploration does not significantly depend on H
H stays fresh
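One way to picture the holdout check is a simplified routine in the spirit of Thresholdout from the reusable-holdout work. It is a sketch under assumptions rather than the published algorithm: it omits budget tracking, the names `holdout_check`, `threshold`, and `noise_scale` are hypothetical, and the noise source is a parameter so that the logic can be exercised deterministically.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def mean(values):
    return sum(values) / len(values)


def holdout_check(query, train, holdout, threshold, noise_scale, rng, noise=laplace):
    """Answer `query` from the training set unless it disagrees with the holdout.

    The holdout influences the answer only through a noisy comparison and,
    on disagreement, a noisy mean, so each answer reveals little about H.
    """
    train_mean = mean([query(x) for x in train])
    holdout_mean = mean([query(x) for x in holdout])
    gap = abs(train_mean - holdout_mean)
    if gap > threshold + noise(noise_scale, rng):
        return holdout_mean + noise(noise_scale, rng)
    return train_mean


rng = random.Random(0)
train = [1, 1, 1, 1]      # the analyst has overfit: the query looks perfect here
holdout = [0, 0, 1, 1]
print(holdout_check(lambda x: x, train, holdout, threshold=0.1, noise_scale=0.01, rng=rng))
```

When training and holdout agree, the analyst learns nothing new about H; when they disagree, only a noisy holdout estimate is released.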
![Page 30: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/30.jpg)
3 Sides of the Same Coin
Fairness, Privacy, Generalizability
![Page 31: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/31.jpg)
“Keep Up the Good Work” – Moni Naor (by channeling)
Let your research be fruitful and multiply
Build the 𝜖 registry, formally or informally
Build libraries, continue outreach efforts
Confront Implications of the Fundamental Law
Prioritization? Who decides? Which fields have the tools?
Public Understanding
Generalization beyond the sample distribution / transfer learning?
Strong relation to fairness
![Page 32: Successes of Differential Privacy · 2020-01-03 · Rich Algorithmic Literature Counts, linear queries, histograms, contingency tables (marginals) Location and spread (eg, median,](https://reader033.fdocuments.net/reader033/viewer/2022050308/5f70212cbc59827f0d0de0c3/html5/thumbnails/32.jpg)
Thank You