Near-Optimal Private Approximation Protocols via a Black Box Transformation


Transcript of Near-Optimal Private Approximation Protocols via a Black Box Transformation

Page 1: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Near-Optimal Private Approximation Protocols

via a Black Box Transformation

David Woodruff, IBM Almaden

Page 2: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Outline

1. Communication Protocols and Goals

2. Private Approximation Protocols

3. Previous Work

4. Our Results

5. Proof of our Main Transformation

Page 3: Near-Optimal Private Approximation Protocols via a Black Box Transformation

t-Party Communication Model

Party i holds input xi, for i = 1, 2, …, t

What is f(x1, x2, …, xt)?

Page 4: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Application – IP session data

Source      Destination   Bytes   Duration   Protocol
18.6.7.1    19.7.3.2      40K     28         http
10.6.2.3    12.3.4.8      20K     18         ftp
11.1.0.6    11.6.8.2      58K     22         http
12.3.1.5    14.7.0.1      30K     32         http
…           …             …       …          …

AT&T collects 100+ GB of NetFlow data every day

Page 5: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Application – IP Session Data

AT&T needs to process a massive stream of network data

Traffic estimation: What fraction of network IP addresses are active? (Distinct elements computation)

Traffic analysis: What are the 100 IP addresses with the most traffic? (Frequent items computation)

Security/Denial of Service: Are there any IP addresses witnessing a spike in traffic? (Skewness computation)

Page 6: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Application – Secure Datamining

For medical research, hospitals wish to mine their joint data

Patient confidentiality imposes strict laws on what information can be shared. Mining cannot leak anything sensitive

Page 7: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Protocol Goals

Communication Complexity: Minimize total number of bits exchanged between the parties

Round Complexity: Minimize total number of messages exchanged between the parties

Computational Complexity: Minimize workload of the parties

Privacy: No party should learn unnecessary information about another party’s input

Page 8: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Outline

1. Communication Protocols and Goals

2. Private Approximation Protocols

3. Previous Work

4. Our Results

5. Proof of our Main Transformation

Page 9: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Initial Observations

Computing many functions requires a huge amount of communication when the parties are deterministic

How do we cope? Allow randomness and a small chance of error

Even if the parties are randomized, the communication is large unless they output approximate answers

How do we cope? Settle for an approximation

This helps with communication, round, and computational complexity, but what is a private randomized approximation?

Page 10: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Privacy Definition

First, what does privacy mean for computing a function f?

Minimal Requirement: ∀ i: Party i does not learn anything about xj, j ≠ i, other than what follows from xi and f(x1, …, xt)

What does privacy mean for approximating a function f?

Does this work? ∀ i: Party i does not learn anything about xj, j ≠ i, other than what follows from xi and the approximation to f(x1, …, xt)

Not Sufficient!!

Page 11: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Privacy Definition

What is the Hamming distance f(x1, x2) between x1 and x2?

Party 1 holds x1 ∈ {0,1}^n; Party 2 holds x2 ∈ {0,1}^n

Set the LSB of the approximation f’(x1, x2) to be the LSB of x2, and the remaining bits of f’(x1, x2) to agree with those of f(x1, x2)

f’(x1, x2) is a ±1 approximation to f(x1, x2), but Party 1 learns the LSB of x2, which doesn’t follow from x1 and f(x1, x2)
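The leak on this slide can be checked mechanically. The following is a minimal Python sketch, assuming bit strings are encoded as non-negative integers; the function names `hamming` and `leaky_approx` are illustrative and not from the talk. It builds the approximation described above and shows that it stays within 1 of the true Hamming distance while its least significant bit always reveals the LSB of the second input.

```python
def hamming(x1: int, x2: int) -> int:
    """Exact Hamming distance between two bit strings encoded as ints."""
    return bin(x1 ^ x2).count("1")


def leaky_approx(x1: int, x2: int) -> int:
    """A +/-1 approximation of the Hamming distance.

    All bits except the lowest agree with the exact answer; the lowest
    bit is overwritten with the LSB of x2, which is the leak.
    """
    f = hamming(x1, x2)
    return (f & ~1) | (x2 & 1)


x1 = 0b1011
for x2 in (0b0110, 0b0111):
    exact, approx = hamming(x1, x2), leaky_approx(x1, x2)
    # Columns: LSB of x2, exact distance, leaky approximation.
    print(x2 & 1, exact, approx)
```

An output that is accurate as an approximation can therefore still carry information that the exact value does not, which is why the definition has to be tightened.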

Page 12: Near-Optimal Private Approximation Protocols via a Black Box Transformation

New Privacy Definition [FIMNSW]

What does privacy mean for approximating a function f?

New Requirement: ∀ i: Party i does not learn anything about xj, j ≠ i, other than what follows from xi and f(x1, …, xt)

Implications: f’(x1, …, xt) is determined by f(x1, …, xt) and the randomness

So, we allow for approximation to reduce communication, but we define privacy with respect to exact computation

Page 13: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Simplifications for This Talk

- We only consider two parties in the rest of the talk

- Their names are Alice and Bob

- Their inputs are x and y

Page 14: Near-Optimal Private Approximation Protocols via a Black Box Transformation

What Can Alice and Bob do to Breach Privacy?

Alice holds x; Bob holds y

Semi-honest: parties follow their instructions but try to learn more than what is prescribed

Malicious: parties deviate from the protocol arbitrarily

- Use a different input

- Force other party to output wrong answer

- Abort before other party learns answer

Difficult to achieve security in the malicious model…

Page 15: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Reductions – Yao, GMW, NN

Protocol secure in the semi-honest model → Protocol secure in the malicious model

Efficiency of the new protocol = Efficiency of the old protocol

It suffices to design protocols in the semi-honest model

The parties follow the instructions of the protocol. Don’t need to worry about “weird” behavior.

Just make sure neither party learns anything about the other party’s input, other than what follows from the exact function value

Page 16: Near-Optimal Private Approximation Protocols via a Black Box Transformation

More Simplifications

Complicated Protocol

Alice: input x, random string rA

Bob: input y, random string rB

Output: f’(x,y)

Using known techniques, just need efficient simulators SA and SB that depend only on x, y, rA, rB and f(x,y)

Page 17: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Simulators

SA(x, f(x,y)) =negl(n) (rA, x, f’(x,y))

SB(y, f(x,y)) =negl(n) (rB, y, f’(x,y))

Page 18: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Outline

1. Communication Protocols and Goals

2. Private Approximation Protocols

3. Previous Work

4. Our Results

5. Proof of our Main Transformation

Page 19: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Known Private Approximations

Problem                        Communication  Rounds  Computation  Papers
Lp-norm, 0 < p ≤ 2             O*(1)          O*(1)   O*(n)        [IW][KMSZ][MM]
L2-heavy hitters (reveals L2)  O*(1)          O*(1)   O*(n)        [KMSZ]

“Even functions that are efficiently computable for moderately sized data sets are often not efficiently computable for massive data sets.” [FIMNSW]

Page 20: Near-Optimal Private Approximation Protocols via a Black Box Transformation

What about all of these problems?

- Lp-norm for p > 2 and p = 0

- Lp-heavy hitters for every p

- Lp-sampling

- Max Dominance Norm

- Distinct Summation

- Empirical Entropy

- Cascaded Moments

- Subspace Approximation

- L2-distance to independence

- Etc.

Page 21: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Other Related Work

- Can privately approximate the permanent of a matrix [FIMNSW]

- Some NP-hard problems can be privately approximated if a few bits are leaked [HKKN]

- Many NP-hard problems cannot be privately approximated even when leaking a large number of bits [BHN]

- If the answer is not unique, e.g., a search problem, private approximations are even harder to come by [BCNW]

Page 22: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Outline

1. Communication Protocols and Goals

2. Private Approximation Protocols

3. Previous Work

4. Our Results

5. Proof of our Main Transformation

Page 23: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Our Main Transformation

• Suppose f = Σ_{i=1}^{n} g(xi, yi)

• Suppose g is non-negative and efficiently computable

• Let Π be an arbitrary non-private protocol for approximating f up to a (1 ± 1/log n)-factor with probability ≥ 2/3

• Then there is a private approximation protocol Π’ for approximating f up to a (1 ± ε)-factor with probability ≥ 2/3

• The communication, round, and computational complexity of Π’ agree with those of Π up to a poly(log n / ε) factor
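To make the hypothesis of the transformation concrete, here is a small Python sketch of functions that decompose as a sum of a non-negative g over coordinates. The helper names (`decomposable_f`, `g_hamming`, `g_lp`) are hypothetical and chosen only for illustration; the two instances shown are the Hamming distance and the p-th power of the Lp-distance.

```python
from typing import Callable, Sequence


def decomposable_f(g: Callable[[float, float], float],
                   x: Sequence[float], y: Sequence[float]) -> float:
    """Evaluate f(x, y) = sum over i of g(x_i, y_i)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(g(a, b) for a, b in zip(x, y))


def g_hamming(a: float, b: float) -> float:
    """Per-coordinate term for the Hamming distance: 1 if the entries differ."""
    return float(a != b)


def g_lp(p: float) -> Callable[[float, float], float]:
    """Per-coordinate term for the p-th power of the Lp-distance."""
    return lambda a, b: abs(a - b) ** p


print(decomposable_f(g_hamming, [1, 0, 1], [1, 1, 0]))
print(decomposable_f(g_lp(3), [1, 5], [3, 2]))
```

Both choices of g are non-negative and cheap to evaluate on a single coordinate, which is all the theorem asks of g.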

Page 24: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Near-Optimal Private Approximation Protocols

Problem                                             Communication    Work
Lp-distance (p > 2), Lp-heavy hitters, Lp-sampling  O*(n^(1-2/p))    O*(1)
Max-Dominance Norm                                  O*(1)            O*(n)
Distinct Summation                                  O*(1)            O*(n)
Empirical Entropy                                   O*(1)            O*(n)
Subspace Approximation                              O*(d)            O*(nd)

Page 25: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Other Private Approximations

Also obtain near-optimal bounds for:

- Cascaded frequency moments

- L2-distance to independence

Using [BO], we get O*(1) communication for any g(xi, yi) = h(xi − yi) where h has “at most quadratic growth”

Page 26: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Weaker Assumptions

If the non-private protocol Π is a “simultaneous protocol”, then it is enough to assume symmetrically private information retrieval with polylog(n) communication [CMS, NP]

Page 27: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Outline

1. Communication Protocols and Goals

2. Private Approximation Protocols

3. Previous Work

4. Our Results

5. Proof of our Main Transformation

Page 28: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Main Transformation

Given a non-private approximation protocol Π for approximating f(x,y) = Σ_{i=1}^{n} g(xi, yi), we design a private approximation protocol Π’

Main Theorem: There is a low-communication importance sampling procedure which does the following: if B is an upper bound on f(x,y), then Alice and Bob sample from a distribution μ on [n] ∪ {⊥}:

∀ i ∈ [n], μ(i) = g(xi, yi)/B

μ(⊥) = 1 − f(x,y)/B

How do we design Π’ given such a procedure?

Page 29: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Private Approximation Protocol

The importance sampling procedure obtains samples from [n] ∪ {⊥}, with 1 − Pr[obtain ⊥] = f(x,y)/B

Thus, this probability depends only on f(x,y)!

1. Let B be an upper bound on f(x,y)

2. The protocol outputs a bit c

3. Since c is a bit, its distribution is determined by its expectation

Pr[c = 1] = 1 − Pr[obtain ⊥] = f(x,y)/B ≤ 1

Repeat a few times to get concentration

If most repetitions return c = 0, replace B with B/2, and repeat

The process of halving B depends only on f(x,y), which helps for simulation

Once B < 2f(x,y), with very high probability, enough coin tosses are 1
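The halving loop on this slide can be sketched in Python under a strong simplifying assumption: the importance sampling procedure is replaced by an idealized coin that returns 1 with probability exactly f/B. The function name `estimate_by_halving`, the repetition count, the majority threshold, and the final estimate B · (fraction of ones) are illustrative choices, not parameters taken from the talk.

```python
import random


def estimate_by_halving(f: float, B: float, reps: int = 2000, seed: int = 0) -> float:
    """Estimate f from coins with Pr[c = 1] = f / B, halving B as needed.

    Assumes 0 < f <= B. The coin is simulated directly from f here; in the
    protocol it would come from the importance sampling procedure.
    """
    if not 0 < f <= B:
        raise ValueError("need 0 < f <= B")
    rng = random.Random(seed)
    while True:
        p = min(1.0, f / B)  # clamp in case an unlucky round halves B below f
        ones = sum(rng.random() < p for _ in range(reps))
        if 2 * ones >= reps:
            # Enough coins came up 1: the empirical mean estimates f / B.
            return B * ones / reps
        B /= 2  # most coins were 0, so B is still far above f


print(estimate_by_halving(f=37.0, B=1024.0))
```

Every quantity the loop looks at is a function of f and fresh randomness only, which is the property the simulators rely on later.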

Page 30: Near-Optimal Private Approximation Protocols via a Black Box Transformation

What’s left?

Need an importance sampling procedure, and need to show our overall approximation protocol is simulatable

We can’t sample exactly from μ on [n] ∪ {⊥}:

∀ i ∈ [n], μ(i) = g(xi, yi)/B

μ(⊥) = 1 − f(x,y)/B

We can sample from a distribution with negl(n) distance from μ

Page 31: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Notation

For input vectors x and y, let f[a,b] = Σ_{i=a}^{b} g(xi, yi)

Page 32: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Importance Sampling

Alice holds x, rA; Bob holds y, rB

Π is a non-private protocol for (1/log n, negl(n))-approximating f = Σ_{i=1}^{n} g(xi, yi)

1. Use Π to estimate f[1, n/2], obtaining f*[1, n/2]

2. Use Π to estimate f[n/2+1, n], obtaining f*[n/2+1, n]

3. Recurse on [1, n/2] with probability f*[1, n/2]/(f*[1, n/2] + f*[n/2+1, n])

4. Else recurse on [n/2+1, n]

f*[1, n/2] is a (1 ± 1/log n)-approximation to f[1, n/2]
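The recursion above can be sketched in Python as a walk down a binary tree over the index range. This is only an illustration: the non-private protocol is replaced by a hypothetical stand-in, `noisy_estimate`, which perturbs the exact partial sum by a random relative error of at most `eps`, and both inputs are collapsed into a single list `g_vals` of per-coordinate values. Indices are zero-based, and the total is assumed to be positive.

```python
import random


def noisy_estimate(g_vals, lo, hi, rng, eps):
    """Stand-in for the non-private protocol: (1 +/- eps)-approximate sum of g_vals[lo:hi]."""
    exact = sum(g_vals[lo:hi])
    return exact * (1.0 + rng.uniform(-eps, eps))


def importance_sample(g_vals, rng, eps=0.05):
    """Walk down the tree; return (index, probability with which this path was taken).

    Assumes sum(g_vals) > 0, so that the two estimates are never both zero.
    """
    lo, hi = 0, len(g_vals)  # current half-open interval [lo, hi)
    prob = 1.0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left = noisy_estimate(g_vals, lo, mid, rng, eps)
        right = noisy_estimate(g_vals, mid, hi, rng, eps)
        p_left = left / (left + right)
        if rng.random() < p_left:
            hi, prob = mid, prob * p_left
        else:
            lo, prob = mid, prob * (1.0 - p_left)
    return lo, prob


g_vals = [3, 0, 1, 4, 0, 2, 5, 1]
i, rho_i = importance_sample(g_vals, random.Random(1))
# Sampled index, its path probability, and the ideal probability g_i / f.
print(i, rho_i, g_vals[i] / sum(g_vals))
```

The returned path probability is what lets a later step correct the slightly skewed distribution into the target one.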

Page 33: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Importance Sampling

Example with n = 8: the root of the tree is f[1,8], its children are f[1,4] and f[5,8], the children of f[1,4] are f[1,2] and f[3,4], and the children of f[3,4] are the leaves g(x3, y3) and g(x4, y4)

With probability f*[1,4]/(f*[1,4] + f*[5,8]) go left, else go right

With probability f*[1,2]/(f*[1,2] + f*[3,4]) go left, else go right

With probability g(x3, y3)/(g(x3, y3) + g(x4, y4)) go left, else go right

Pr[g(x3, y3) chosen] = f*[1,4]/(f*[1,4] + f*[5,8]) × f*[3,4]/(f*[1,2] + f*[3,4]) × g(x3, y3)/(g(x3, y3) + g(x4, y4)) = C · g(x3, y3)/f(x,y)

Page 34: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Importance Sampling

The procedure gives a way to sample from a distribution ρ:

ρ(i) = Ci · g(xi, yi)/f(x,y), where Ci ∈ [1/2, 2]

If i is sampled, then we know the probability ρ(i) that we chose it

We can also obtain g(xi, yi) efficiently

With probability g(xi, yi)/(ρ(i) · B), output i, else output ⊥!

Pr[don’t output ⊥] = Σ_i ρ(i) · g(xi, yi)/(ρ(i) · B) = f(x,y)/B

Hence, we sample from μ:

∀ i ∈ [n], μ(i) = g(xi, yi)/B

μ(⊥) = 1 − f(x,y)/B

(up to negl(n), since there is a small probability that Π fails)
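The correction step on this slide can be verified with exact arithmetic. The Python sketch below uses hypothetical names (`corrected_distribution`, and the key "bottom" for the null outcome) and assumes that B is large enough for every acceptance probability to be at most 1. It takes an arbitrary proposal distribution over indices, applies the accept-or-reject rule, and returns the resulting output distribution.

```python
from fractions import Fraction


def corrected_distribution(g, rho, B):
    """Output distribution after accepting index i with probability g_i / (rho_i * B).

    g: per-coordinate values g(x_i, y_i); rho: proposal probabilities;
    B: upper bound, assumed large enough that each acceptance probability is <= 1.
    """
    mu = {}
    for i, (g_i, rho_i) in enumerate(zip(g, rho)):
        accept = Fraction(g_i) / (rho_i * B) if rho_i else Fraction(0)
        if not 0 <= accept <= 1:
            raise ValueError("B is too small for this proposal distribution")
        mu[i] = rho_i * accept  # Pr[propose i] * Pr[accept i] = g_i / B
    mu["bottom"] = 1 - sum(mu.values())  # probability of the null outcome
    return mu


g = [3, 1, 4]                                          # f = 8
rho = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # within a factor 2 of g_i / f
print(corrected_distribution(g, rho, B=16))
```

The proposal probabilities cancel, so the output depends on the proposal only through the requirement that acceptance probabilities stay at most 1.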

Page 35: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Simulators

For f’(x,y), SA generates random coins with expectation f(x,y)/B, and keeps halving B until there are enough coin tosses equal to 1

For rA, SA outputs a random rA

SA outputs (rA, x, f’(x,y)), whose distribution is equal to the distribution in Π’ except with negl(n) probability

SA(x, f(x,y)) =negl(n) (rA, x, f’(x,y))

Page 36: Near-Optimal Private Approximation Protocols via a Black Box Transformation

Conclusions

Any non-private approximation protocol for a function f = Σ_{i=1}^{n} g(xi, yi) can be transformed into a private one with an O*(1) blowup in complexity

Many problems can be expressed this way (e.g., Lp-norms), even non-obvious ones (e.g., entropy), for which we had no technique for achieving a private approximation

What about other functions?