On the Limits of Dictatorial Classification Reshef Meir School of Computer Science and Engineering, Hebrew University Joint work with Shaull Almagor, Assaf Michaely and Jeffrey S. Rosenschein
Date posted: 19-Dec-2015

Page 1:

On the Limits of Dictatorial Classification

Reshef Meir

School of Computer Science and Engineering, Hebrew University

Joint work with Shaull Almagor, Assaf Michaely and Jeffrey S. Rosenschein

Page 2:

Strategy-Proof Classification

• An Example

• Motivation

• Our Model and previous results

• Filling the gap: proving a lower bound

• The weighted case

Page 3:

Strategic labeling: an example

[Figure: the ERM classifier on the labeled points, 5 errors]

Page 4:

There is a better classifier! (for me…)

Page 5:

If I just change the labels…

2+5 = 7 errors

Page 6:

Classification

The Supervised Classification problem:

– Input: a set of labeled data points $\{(x_i, y_i)\}_{i=1..m}$

– Output: a classifier c from some predefined concept class C (e.g., functions of the form $f : X \to \{-,+\}$)

– We usually want c to classify correctly not just the sample, but to generalize well, i.e., to minimize

$R(c) \equiv \mathbb{E}_{(x,y)\sim D}[\,c(x) \ne y\,]$, the expected number of errors w.r.t. the distribution D (the 0/1 loss function)

Page 7:

Classification (cont.)

• A common approach is to return the ERM (Empirical Risk Minimizer), i.e., the concept in C that is the best w.r.t. the given samples (has the lowest number of errors)

• Generalizes well under some assumptions on the concept class C (e.g., linear classifiers tend to generalize well)

With multiple experts, we can’t trust our ERM!

Page 8:

Where do we find “experts” with incentives?

Example 1: A firm learning purchase patterns

– Information gathered from local retailers

– The resulting policy affects them

– “The best policy is the policy that fits my pattern”

Page 9:

Example 2: Internet polls / polls of experts

[Diagram: Users, Reported Dataset, Classification Algorithm, Classifier]

Page 10:

Motivation from other domains

Aggregating partitions

Judgment aggregation

Facility location (on the binary cube)

Judgment aggregation example (one row per agent):

A    B    A & B    A | ~B
T    F    F        T
F    T    F        F
F    F    F        T

Page 11:

A problem instance is defined by

• A set of agents $I = \{1,\dots,n\}$

• A set of data points $X = \{x_1,\dots,x_m\} \subseteq \mathcal{X}$

• For each $x_k \in X$, agent i has a label $y_{ik} \in \{-,+\}$

– Each pair $s_{ik} = \langle x_k, y_{ik} \rangle$ is a sample

– All samples of a single agent compose the labeled dataset $S_i = \{s_{i1},\dots,s_{i,m(i)}\}$

• The joint dataset $S = \langle S_1, S_2, \dots, S_n \rangle$ is our input

– $m = |S|$

• We denote the dataset with the reported labels by $S'$

Page 12:

Input: Example

[Figure: Agent 1, Agent 2 and Agent 3 each label the same data points with + or -]

$X \in \mathcal{X}^m$

$Y_1 \in \{-,+\}^m$, $Y_2 \in \{-,+\}^m$, $Y_3 \in \{-,+\}^m$

$S = \langle S_1, S_2, \dots, S_n \rangle = \langle (X,Y_1), \dots, (X,Y_n) \rangle$

Page 13:

Mechanisms

• A mechanism M receives a labeled dataset S and outputs $c = M(S) \in C$

• Private risk of i: $R_i(c,S) = |\{k : c(x_{ik}) \ne y_{ik}\}| / m_i$ (% of errors on $S_i$)

• Global risk: $R(c,S) = |\{i,k : c(x_{ik}) \ne y_{ik}\}| / m$ (% of errors on S)

• We allow non-deterministic mechanisms

– Measure the expected risk
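The two risk definitions above can be sketched in a few lines of Python. This is only an illustration: the names `private_risk`, `global_risk` and the toy dataset `S` are hypothetical and do not come from the slides, and labels are represented as the strings "+" and "-".

```python
from typing import Callable, Hashable, List, Tuple

Sample = Tuple[Hashable, str]            # (data point, label in {"+", "-"})
Classifier = Callable[[Hashable], str]   # maps a data point to "+" or "-"


def private_risk(c: Classifier, s_i: List[Sample]) -> float:
    """Fraction of agent i's own samples that c mislabels."""
    return sum(c(x) != y for x, y in s_i) / len(s_i)


def global_risk(c: Classifier, s: List[List[Sample]]) -> float:
    """Fraction of all samples in the joint dataset that c mislabels."""
    total = sum(len(s_i) for s_i in s)
    errors = sum(c(x) != y for s_i in s for x, y in s_i)
    return errors / total


# Toy data (an assumption for illustration): two agents label the same points.
all_positive: Classifier = lambda x: "+"
S = [
    [("x1", "+"), ("x2", "+"), ("x3", "-")],   # agent 1
    [("x1", "-"), ("x2", "-"), ("x3", "-")],   # agent 2
]

risk_agent_1 = private_risk(all_positive, S[0])
risk_agent_2 = private_risk(all_positive, S[1])
risk_global = global_risk(all_positive, S)
```

On this toy input the all-positive classifier errs on one of agent 1's three samples, on all of agent 2's samples, and on four of the six samples overall.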

Page 14:

ERM

We compare the outcome of M to the ERM:

$c^* = \mathrm{ERM}(S) = \arg\min_{c \in C} R(c, S)$

$r^* = R(c^*, S)$

Can our mechanism simply compute and return the ERM?
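For a finite concept class, the ERM can be found by brute force. The sketch below assumes a hypothetical two-concept class (`all_positive`, `all_negative`) and a made-up dataset; the helper names are illustrative and ties are broken by dictionary order, which the slides do not specify.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Sample = Tuple[Hashable, str]
Classifier = Callable[[Hashable], str]


def global_risk(c: Classifier, s: List[List[Sample]]) -> float:
    """Fraction of all samples in the joint dataset that c mislabels."""
    total = sum(len(s_i) for s_i in s)
    return sum(c(x) != y for s_i in s for x, y in s_i) / total


def erm(concepts: Dict[str, Classifier], s: List[List[Sample]]) -> Tuple[str, float]:
    """Return (name of a risk-minimizing concept, its global risk r*).

    Ties go to the concept listed first in `concepts` (an arbitrary choice).
    """
    best = min(concepts, key=lambda name: global_risk(concepts[name], s))
    return best, global_risk(concepts[best], s)


# Hypothetical concept class and data, for illustration only.
CONCEPTS: Dict[str, Classifier] = {
    "all_positive": lambda x: "+",
    "all_negative": lambda x: "-",
}
S = [
    list(enumerate("+++--")),   # agent 1: three "+" labels, two "-"
    list(enumerate("----+")),   # agent 2: four "-" labels, one "+"
]

c_star, r_star = erm(CONCEPTS, S)
```

Here the joint dataset holds four "+" and six "-" labels, so the all-negative concept is the ERM with four errors out of ten.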


Page 15:

Requirements

1. Good approximation: $\forall S:\ R(M(S), S) \le \alpha \cdot r^*$

2. Strategy-Proofness (SP): $\forall i, S, S_i':\ R_i(M(S_{-i}, S_i'), S) \ge R_i(M(S), S)$ (left-hand side: lying; right-hand side: truth)

• ERM(S) is 1-approximating but not SP

• ERM(S1) is SP but gives bad approximation

Are there any mechanisms that guarantee both SP and good approximation?

MOST IMPORTANT SLIDE
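A small worked instance shows why returning ERM(S) violates the SP requirement. The sketch assumes the two-concept class of constant classifiers, a made-up pair of agents, and ties broken toward "+"; none of these details are taken from the slides.

```python
from typing import List


def erm_constant(reports: List[List[str]]) -> str:
    """ERM over the constant classifiers: the majority reported label.

    Ties go to "+" (an arbitrary assumption).
    """
    labels = [y for s_i in reports for y in s_i]
    return "+" if labels.count("+") >= labels.count("-") else "-"


def private_risk(c: str, true_labels: List[str]) -> float:
    """Fraction of an agent's true labels that the constant classifier c gets wrong."""
    return sum(y != c for y in true_labels) / len(true_labels)


# True labels of two hypothetical agents.
truth = [list("+++--"), list("----+")]

# Everyone reports truthfully: 4 "+" vs 6 "-" labels, so ERM returns "-".
honest_outcome = erm_constant(truth)
risk_when_truthful = private_risk(honest_outcome, truth[0])

# Agent 1 reports all "+": 6 "+" vs 4 "-" labels, so ERM flips to "+".
lie = [list("+++++"), truth[1]]
manipulated_outcome = erm_constant(lie)
risk_when_lying = private_risk(manipulated_outcome, truth[0])
```

Agent 1's risk is always measured against the true labels. It drops from 3/5 to 2/5 by lying, so the inequality in requirement 2 fails for this mechanism.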

Page 16:

Related work

• A study of SP mechanisms in Regression learning

– O. Dekel, F. Fischer and A. D. Procaccia, SODA (2008), JCSS (2009). [supervised learning]

• No SP mechanisms for Clustering

– J. Perote-Peña and J. Perote, Economics Bulletin (2003) [unsupervised learning]

Page 17:

A simple case

Previous work: Meir, Procaccia and Rosenschein, AAAI 2008

• Tiny concept class: |C|= 2

• Either “all positive” or “all negative”

Theorem:

• There is an SP 2-approximation mechanism

• There are no SP α-approximation mechanisms, for any α<2

Page 18:

General concept classes

Previous work: Meir, Procaccia and Rosenschein, IJCAI 2009

Theorem: Selecting a dictator at random is SP and guarantees a $(3 - \frac{2}{n})$-approximation

– True for any concept class C

– Generalizes well from sampled data when C has a bounded VC dimension

Open question #1: are there better mechanisms?

Open question #2: what if agents are weighted?
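The random-dictator mechanism from the theorem above can be sketched as follows. This is a minimal illustration under stated assumptions: the dictator is drawn uniformly, the concept class is a hypothetical pair of constant classifiers, the dataset is made up, and ties in the dictator's ERM are broken by dictionary order.

```python
import random
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Sample = Tuple[Hashable, str]
Classifier = Callable[[Hashable], str]


def errors(c: Classifier, samples: List[Sample]) -> int:
    """Number of samples that c mislabels."""
    return sum(c(x) != y for x, y in samples)


def dictator_choice(concepts: Dict[str, Classifier], s_i: List[Sample]) -> str:
    """The concept that is best for a single agent (that agent's own ERM)."""
    return min(concepts, key=lambda name: errors(concepts[name], s_i))


def random_dictator(concepts: Dict[str, Classifier], s: List[List[Sample]],
                    rng: Optional[random.Random] = None) -> Tuple[int, str]:
    """Pick one agent uniformly at random and return (agent index, their best concept)."""
    rng = rng or random.Random()
    i = rng.randrange(len(s))
    return i, dictator_choice(concepts, s[i])


def expected_global_risk(concepts: Dict[str, Classifier], s: List[List[Sample]]) -> float:
    """Expected global risk of the mechanism, averaging over the uniform dictator draw."""
    m = sum(len(s_i) for s_i in s)
    all_samples = [sample for s_i in s for sample in s_i]
    risks = [errors(concepts[dictator_choice(concepts, s_i)], all_samples) / m for s_i in s]
    return sum(risks) / len(s)


# Hypothetical concept class and data, for illustration only.
CONCEPTS: Dict[str, Classifier] = {
    "all_positive": lambda x: "+",
    "all_negative": lambda x: "-",
}
S = [list(enumerate("+++--")), list(enumerate("----+"))]

expected = expected_global_risk(CONCEPTS, S)
optimal = min(errors(c, [t for s_i in S for t in s_i]) for c in CONCEPTS.values()) / 10
ratio = expected / optimal
```

On this toy instance the ratio between the expected risk and the optimal risk is 1.25, within the $3 - \frac{2}{n} = 2$ guarantee for n = 2. The mechanism is SP because an agent's report only matters when that agent is the dictator, in which case reporting truthfully yields their own best concept.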

Page 19:

A lower bound

Our main result:

Theorem: There is a concept class C (where |C|=3), for which any SP mechanism has an approximation ratio of at least $3 - \frac{2}{n}$

o Matching the upper bound from IJCAI-09

o Proof is by a careful reduction to a voting scenario

o We will see the proof sketch

Page 20:

Proof sketch

Gibbard [’77] proved that every (randomized) SP voting rule for 3 candidates must be a lottery over dictators*.

We define X = {x,y,z}, and C as follows:

       x    y    z
cx     +    -    -
cy     -    +    -
cz     -    -    +

We also restrict the agents, so that each agent can have mixed labels on just one point

Example label profiles (one row per agent):

x           y           z
--------    ++++----    ++++++++
++++++++    --------    ++------

Page 21:

Proof sketch (cont.)

Example label profiles (one row per agent):

x           y           z
--------    ++++----    ++++++++
++++++++    --------    ++------

Suppose that M is SP

Page 22:

Proof sketch (cont.)

Example label profiles (one row per agent), with the induced preference order over concepts:

x           y           z
--------    ++++----    ++++++++    cz > cy > cx
++++++++    --------    ++------    cx > cz > cy

Suppose that M is SP

1. M must be monotone on the mixed point

2. M must ignore the mixed point

3. M is a (randomized) voting rule
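The two preference orders shown on this slide can be checked by counting errors. The sketch assumes that each row of the figure is one agent's labels, with eight labeled copies of each of the points x, y and z; the variable names `agent_1` and `agent_2` are hypothetical.

```python
from typing import Dict, List

# The concept class from the proof sketch: each concept labels exactly one point "+".
C: Dict[str, Dict[str, str]] = {
    "cx": {"x": "+", "y": "-", "z": "-"},
    "cy": {"x": "-", "y": "+", "z": "-"},
    "cz": {"x": "-", "y": "-", "z": "+"},
}

# Label profiles read off the figure (assumed: one string of 8 labels per point).
agent_1 = {"x": "--------", "y": "++++----", "z": "++++++++"}
agent_2 = {"x": "++++++++", "y": "--------", "z": "++------"}


def errors(concept: Dict[str, str], labels: Dict[str, str]) -> int:
    """Number of an agent's labels that disagree with the concept."""
    return sum(lab != concept[point] for point, labs in labels.items() for lab in labs)


def ranking(labels: Dict[str, str]) -> List[str]:
    """Concepts sorted from fewest to most errors, i.e. the agent's preference order."""
    return sorted(C, key=lambda name: errors(C[name], labels))


errors_1 = {name: errors(C[name], agent_1) for name in C}
errors_2 = {name: errors(C[name], agent_2) for name in C}
order_1 = ranking(agent_1)
order_2 = ranking(agent_2)
```

Under these assumptions the first agent makes 4, 12 and 20 errors with cz, cy and cx, and the second makes 2, 14 and 18 errors with cx, cz and cy, which reproduces the orders cz > cy > cx and cx > cz > cy. Each agent's labels thus act like a ranked ballot over three candidates, which is what lets the argument treat M as a voting rule.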

Page 23:

Proof sketch (cont.)

Example label profiles (one row per agent), with the induced preference order over concepts:

x           y           z
--------    ++++----    ++++++++    cz > cy > cx
++++++++    --------    ++------    cx > cz > cy

4. By Gibbard [’77], M is a random dictator

5. We construct an instance where random dictators perform poorly

[Figure annotations: 1/3, 2/3]

Page 24:

Weighted agents

• We must select a dictator randomly

• However, probability may be based on weight

• Naïve approach: $pr(i) \propto w_i$

o Only gives 3-approximation

• An optimal SP algorithm: $pr(i) \propto \frac{w_i}{2(1 - w_i)}$

o Matches the lower bound of $3 - \frac{2}{n}$
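The two ways of choosing the dictator's selection probabilities can be compared numerically. The sketch below rests on assumptions not stated on the slide: the formulas are read as $pr(i) \propto w_i$ and $pr(i) \propto \frac{w_i}{2(1 - w_i)}$, the weights are normalized to sum to 1, and every normalized weight is strictly below 1. The function names are hypothetical.

```python
from typing import List


def normalize(values: List[float]) -> List[float]:
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def naive_probs(weights: List[float]) -> List[float]:
    """Select dictator i with probability proportional to w_i."""
    return normalize(weights)


def adjusted_probs(weights: List[float]) -> List[float]:
    """Select dictator i with probability proportional to w_i / (2 * (1 - w_i)).

    Assumes at least two agents with positive weight, so each normalized w_i < 1.
    """
    w = normalize(weights)
    return normalize([wi / (2 * (1 - wi)) for wi in w])


equal = [1.0, 1.0, 1.0, 1.0]
skewed = [0.5, 0.3, 0.2]

naive_equal = naive_probs(equal)
adjusted_equal = adjusted_probs(equal)
naive_skewed = naive_probs(skewed)
adjusted_skewed = adjusted_probs(skewed)
```

With equal weights both rules reduce to the uniform random dictator. With unequal weights the second rule shifts probability toward heavier agents, beyond their share of the total weight.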

Page 25:

Future work

• Other concept classes

• Other loss functions (linear loss, quadratic loss,…)

• Alternative assumptions on structure of data

• Other models of strategic behavior

• …