Get Another Label? Using Multiple, Noisy Labelers Joint work with Victor Sheng and Foster Provost...


Get Another Label? Using Multiple, Noisy Labelers

Joint work with Victor Sheng and Foster Provost

Panos Ipeirotis

Stern School of Business, New York University


Motivation

Many tasks rely on high-quality labels for objects:

– relevance judgments

– duplicate database records

– image recognition

– song categorization

– videos

Labeling can be relatively inexpensive, using Mechanical Turk, ESP game …


ESP Game (by Luis von Ahn)


Mechanical Turk Example

“Are these two documents about the same topic?”


Mechanical Turk Example


Motivation

Labels can be used in training predictive models

– Duplicate detection systems

– Image recognition

– Web search

But: labels obtained from the above sources are noisy. This directly affects the quality of the learned models

– How can we know the quality of annotators?

– How can we know the correct answer?

– How can we best use noisy annotators?


Quality and Classification Performance

[Figure: accuracy (40 to 100) vs. number of examples (1 to 300, Mushroom), one curve per labeling quality: Q = 0.5, Q = 0.6, Q = 0.8, Q = 1.0]

Labeling quality increases → classification quality increases


How to Improve Labeling Quality

Find better labelers

– Often expensive, or beyond our control

Use multiple, noisy labelers: repeated-labeling

– Our focus


Our Focus: Labeling using Multiple Noisy Labelers

Multiple labelers and resulting label quality

Multiple labelers and classification quality

Selective label acquisition


Majority Voting and Label Quality

Ask multiple labelers, keep majority label as “true” label

Quality is probability of majority label being correct

P is probability of individual labeler being correct

[Figure: integrated quality (0.2 to 1) vs. number of labelers (1, 3, 5, 7, 9, 11, 13), one curve per P = 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
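The majority-voting quality described on this slide can be sketched in a few lines. This is a minimal illustration, assuming labelers are independent, all share the same accuracy P, labels are binary, and the number of labelers is odd (so there are no ties); the function name `majority_quality` is hypothetical.

```python
from math import comb


def majority_quality(p: float, n_labelers: int) -> float:
    """Probability that the majority label is correct.

    Assumes n_labelers independent labelers, each correct with
    probability p, binary labels, and an odd n_labelers (no ties).
    """
    if n_labelers < 1 or n_labelers % 2 == 0:
        raise ValueError("n_labelers must be a positive odd integer")
    smallest_majority = n_labelers // 2 + 1
    # Sum the binomial probabilities of every outcome in which
    # more than half of the labelers are correct.
    return sum(
        comb(n_labelers, k) * p**k * (1 - p) ** (n_labelers - k)
        for k in range(smallest_majority, n_labelers + 1)
    )


if __name__ == "__main__":
    for p in (0.4, 0.5, 0.6, 0.8):
        print(p, [round(majority_quality(p, n), 3) for n in (1, 3, 5, 7)])
```

Under these assumptions the quality rises with more labelers when P > 0.5, stays at 0.5 when P = 0.5, and falls when P < 0.5.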


So…

(Sometimes) quality of multiple noisy labelers better than quality of best labeler in set


Multiple noisy labelers improve quality

So, should we always get multiple labels?


Tradeoffs for Classification

Get more labels → Improve label quality → Improve classification

Get more examples → Improve classification

[Figure: accuracy (40 to 100) vs. number of examples (1 to 300, Mushroom), one curve per labeling quality: Q = 0.5, Q = 0.6, Q = 0.8, Q = 1.0]


Basic Labeling Strategies

Get as many data points as possible, one label each

Repeatedly-label everything, same number of times


Repeat-Labeling vs. Single Labeling

P = 0.6 (labeling quality), K = 5 (#labels/example)

[Figure: comparison of Repeated and Single labeling]

With high noise, repeated labeling better than single labeling


Repeat-Labeling vs. Single Labeling

P = 0.8 (labeling quality), K = 5 (#labels/example)

[Figure: comparison of Repeated and Single labeling]

With low noise, more (single labeled) examples better
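The tradeoff behind these two slides can be made concrete with a little arithmetic. The sketch below assumes a fixed budget of labels, independent labelers with equal accuracy P, and majority voting over K labels per example; it only reports how many examples each strategy yields and at what label quality, not the resulting classifier accuracy. The names `majority_quality` and `compare_strategies` are hypothetical.

```python
from math import comb


def majority_quality(p: float, k: int) -> float:
    """Probability that the majority of k independent labels is correct (k odd)."""
    return sum(
        comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(k // 2 + 1, k + 1)
    )


def compare_strategies(budget: int, p: float, k: int) -> dict:
    """Split a fixed label budget two ways.

    single:   one label per example  -> `budget` examples at quality p
    repeated: k labels per example   -> `budget // k` examples at majority quality
    """
    return {
        "single": {"examples": budget, "label_quality": p},
        "repeated": {
            "examples": budget // k,
            "label_quality": majority_quality(p, k),
        },
    }


if __name__ == "__main__":
    for p in (0.6, 0.8):
        print(p, compare_strategies(budget=1000, p=p, k=5))
```

With K = 5, repeated labeling always costs four fifths of the examples; what it buys in label quality depends on P, which is the tension the two slides illustrate.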


Estimating Labeler Quality

(Dawid, Skene 1979): “Multiple diagnoses”

– Assume equal qualities

– Estimate “true” labels for examples

– Estimate qualities of labelers given the “true” labels

– Repeat until convergence
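The four steps above can be sketched as a simple loop. This is a deliberately simplified, hard-assignment illustration for binary labels with one accuracy number per labeler; Dawid and Skene's method itself estimates per-class error rates and uses probabilistic label estimates. The function name, the starting quality of 0.7, the log-odds vote weights, and the add-one smoothing are all assumptions made for this sketch.

```python
import math


def estimate_labeler_quality(votes, n_iter=50):
    """votes: {example_id: {labeler_id: +1 or -1}}.

    Returns (estimated_labels, estimated_quality_per_labeler).
    """
    labelers = {w for by_labeler in votes.values() for w in by_labeler}
    # Step 1: assume equal qualities (0.7 is an arbitrary starting value).
    quality = {w: 0.7 for w in labelers}
    labels = {}
    for _ in range(n_iter):
        # Step 2: estimate "true" labels with a quality-weighted vote.
        new_labels = {}
        for x, by_labeler in votes.items():
            score = sum(
                label * math.log(quality[w] / (1 - quality[w]))
                for w, label in by_labeler.items()
            )
            new_labels[x] = 1 if score >= 0 else -1
        # Step 3: estimate each labeler's quality against those labels.
        # Add-one smoothing keeps every estimate strictly inside (0, 1).
        new_quality = {}
        for w in labelers:
            seen = [x for x, by_labeler in votes.items() if w in by_labeler]
            agree = sum(1 for x in seen if votes[x][w] == new_labels[x])
            new_quality[w] = (agree + 1) / (len(seen) + 2)
        # Step 4: repeat until nothing changes.
        if new_labels == labels and new_quality == quality:
            break
        labels, quality = new_labels, new_quality
    return labels, quality


if __name__ == "__main__":
    truth = [1, -1, 1, 1, -1, -1]
    votes = {i: {"a": t, "b": t, "c": -t} for i, t in enumerate(truth)}
    print(estimate_labeler_quality(votes))
```

In the toy data, labelers `a` and `b` always agree with each other and `c` always disagrees, so the loop ends up trusting `a` and `b` and assigning `c` a low quality.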


Selective Repeated-Labeling

We have seen:

– With noise and enough (noisy) examples, getting multiple labels is better than single-labeling

Can we do better?

Select data points by uncertainty score when allocating the repeated-labeling budget, e.g. {+,-,+,+,-,+,+} vs. {+,+,+,+}


Natural Candidate: Entropy

Entropy is a natural measure of label uncertainty:

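$$E(S) = -\frac{|S_+|}{|S|}\log_2\frac{|S_+|}{|S|} - \frac{|S_-|}{|S|}\log_2\frac{|S_-|}{|S|}$$

where $|S_+|$ is the number of positive labels in $S$ and $|S_-|$ is the number of negative labels in $S$.

E({+,+,+,+,+,+}) = 0

E({+,-,+,-,+,-}) = 1

Strategy: Get more labels for high-entropy examples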
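As a small illustration of the entropy score on this slide, the sketch below computes the base-2 entropy of a multiset of binary labels from its positive and negative counts. The function name `label_entropy` is hypothetical, and an empty count is treated as contributing zero (the usual 0·log 0 = 0 convention).

```python
from math import log2


def label_entropy(pos: int, neg: int) -> float:
    """Base-2 entropy of a label multiset with `pos` +'s and `neg` -'s."""
    total = pos + neg
    if total == 0:
        raise ValueError("need at least one label")
    entropy = 0.0
    for count in (pos, neg):
        if count:  # skip empty classes: 0 * log(0) is taken as 0
            frac = count / total
            entropy -= frac * log2(frac)
    return entropy


if __name__ == "__main__":
    print(label_entropy(6, 0))  # all labels agree
    print(label_entropy(3, 3))  # labels evenly split
```

An entropy-based strategy would then request more labels for the examples with the highest score.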


What Not to Do: Use Entropy

[Figure: labeling quality (0.6 to 0.95) vs. number of labels (0 to 2000; waveform, p=0.6) for ENTROPY and UNF (round robin)]

Improves at first, hurts in long run


Why not Entropy

In the presence of noise, entropy will be high even with many labels

Entropy is scale invariant – (3+ , 2-) has same entropy as (600+ , 400-)
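To spell out the scale-invariance point, here is the arithmetic for the two label sets, assuming the base-2 binary entropy. Only the fractions of positive and negative labels enter the formula, so both sets reduce to the same expression:

$$E(3+, 2-) = -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.971$$

$$E(600+, 400-) = -\tfrac{600}{1000}\log_2\tfrac{600}{1000} - \tfrac{400}{1000}\log_2\tfrac{400}{1000} = -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.971$$

The score therefore cannot tell 5 labels apart from 1000 labels with the same split.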


Estimating Label Uncertainty (LU)

Observe +’s and –’s and compute Pr{+|obs} and Pr{-|obs}

Label uncertainty = tail of beta distribution

[Figure: beta probability density function on [0.0, 1.0], with the tail area S_LU marked at 0.5]
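One way to compute a beta-tail score of this kind is sketched below. It assumes a uniform prior, so that after observing `pos` +'s and `neg` -'s the posterior is Beta(pos + 1, neg + 1), and it takes the score to be the smaller of the two posterior masses on either side of 0.5; both choices are assumptions of this sketch. To stay within the standard library it uses the identity that, for integer parameters, the beta CDF at x equals a binomial tail probability. The function names are hypothetical.

```python
from math import comb


def beta_cdf_at_half(a: int, b: int) -> float:
    """CDF of Beta(a, b) at 0.5 for positive integers a and b.

    Uses I_x(a, b) = P(Binomial(a + b - 1, x) >= a), evaluated at x = 0.5.
    """
    n = a + b - 1
    return sum(comb(n, j) for j in range(a, n + 1)) / 2**n


def label_uncertainty(pos: int, neg: int) -> float:
    """Smaller tail of the Beta(pos + 1, neg + 1) posterior around 0.5."""
    below_half = beta_cdf_at_half(pos + 1, neg + 1)
    return min(below_half, 1.0 - below_half)


if __name__ == "__main__":
    for pos, neg in [(3, 2), (7, 3), (14, 6)]:
        print((pos, neg), round(label_uncertainty(pos, neg), 4))
```

Unlike entropy, this score shrinks as labels accumulate with the same split: in this sketch it decreases from (3+, 2-) to (7+, 3-) to (14+, 6-).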


Label Uncertainty

p=0.7 5 labelers

(3+, 2-) Entropy ~ 0.97


Label Uncertainty

p=0.7 10 labelers

(7+, 3-) Entropy ~ 0.88


Label Uncertainty

p=0.7 20 labelers

(14+, 6-) Entropy ~ 0.88


Comparison

[Figure: labeling quality (0.6 to 1) vs. number of labels (0 to 2000; waveform, p=0.6) for UNF, MU, LU, and LMU. Callouts: Label Uncertainty; Uniform, round robin]


Model Uncertainty (MU)

However, labelers are not our only source of labels

A classifier can also give us labels!

Model uncertainty: get more labels for ambiguous/difficult examples

Intuitively: make sure that difficult cases are correct

[Figure: scatter plot of positive (+) and negative (-) examples, with uncertain examples marked "?"]


Label + Model Uncertainty

Label and model uncertainty (LMU): avoid examples where either strategy is certain

$$S_{LMU} = \sqrt{S_{LU} \cdot S_{MU}}$$
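A combined score of this kind can be sketched as follows, assuming the LMU score is the geometric mean of the label-uncertainty and model-uncertainty scores, so that it is zero whenever either one is certain. The model-uncertainty definition used here, 0.5 minus the distance of the model's predicted probability from 0.5, is an assumption of this sketch, as are all function names.

```python
from math import sqrt


def model_uncertainty(prob_positive: float) -> float:
    """Assumed MU score: largest (0.5) when the model predicts 0.5."""
    return 0.5 - abs(prob_positive - 0.5)


def lmu_score(s_lu: float, s_mu: float) -> float:
    """Geometric mean of the two scores; zero if either score is zero."""
    return sqrt(s_lu * s_mu)


def pick_for_relabeling(scores: dict, n: int) -> list:
    """Return the n example ids with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    # Hypothetical per-example inputs: (label uncertainty, model P(+)).
    examples = {"x1": (0.34, 0.55), "x2": (0.04, 0.50), "x3": (0.30, 0.99)}
    scores = {
        ex: lmu_score(s_lu, model_uncertainty(p))
        for ex, (s_lu, p) in examples.items()
    }
    print(pick_for_relabeling(scores, 2))
```

The geometric mean is what makes the rule "avoid examples where either strategy is certain" work: a near-zero score on one side pulls the product toward zero regardless of the other.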


Comparison

[Figure: labeling quality (0.6 to 1) vs. number of labels (0 to 2000; waveform, p=0.6) for UNF, MU, LU, and LMU. Callouts: Label Uncertainty; Uniform, round robin; Label + Model Uncertainty]

Model Uncertainty alone also improves quality


Classification Improvement

[Figure: accuracy (60 to 85) vs. number of labels (0 to 2000; spambase, p=0.6) for UNF, MU, LU, and LMU]


Conclusions

Gathering multiple labels from noisy users is a useful strategy

Under high noise, almost always better than single-labeling

Selectively labeling using label and model uncertainty is more effective


More Work to Do

Estimating the labeling quality of each labeler

Increased compensation vs. labeler quality

Example-conditional quality issues (some examples more difficult than others)

Multiple “real” labels

Hybrid labeling strategies using “learning-curve gradient”


Other Projects

SQoUT project: Structured Querying over Unstructured Text
http://sqout.stern.nyu.edu

Faceted Interfaces

EconoMining project: The Economic Value of User Generated Content
http://economining.stern.nyu.edu


SQoUT: Structured Querying over Unstructured Text

Information extraction applications extract structured relations from unstructured text

May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, which is in the front line of the world's response to the deadly Ebola epidemic in Zaire, is finding itself hard pressed to cope with the crisis…

Information Extraction System (e.g., NYU’s Proteus)

Disease Outbreaks in The New York Times

Date         Disease Name       Location
Jan. 1995    Malaria            Ethiopia
July 1995    Mad Cow Disease    U.K.
Feb. 1995    Pneumonia          U.S.
May 1995     Ebola              Zaire


SQoUT: The Questions

[Diagram: Text Databases → Extraction System(s) → Output Tokens]

1. Retrieve documents from database/web/archive

2. Process documents

3. Extract output tuples

Questions:

1. How do we retrieve the documents?

2. How to configure the extraction systems?

3. What is the execution time?

4. What is the output quality?

SIGMOD’06, TODS’07, + in progress


EconoMining Project

Show me the Money!

Applications (in increasing order of difficulty)

Buyer feedback and seller pricing power in online marketplaces (ACL 2007)

Product reviews and product sales (KDD 2007)

Importance of reviewers based on economic impact (ICEC 2007)

Hotel ranking based on “bang for the buck” (WebDB 2008)

Political news (MSM, blogs), prediction markets, and news importance

Basic Idea

Opinion mining an important application of information extraction

Opinions of users are reflected in some economic variable (price, sales)


Some Indicative Dollar Values

Positive    Negative

Natural method for extracting sentiment strength and polarity

good packaging -$0.56

Naturally captures the pragmatic meaning within the given context

captures misspellings as well

Positive? Negative?


Thanks!

Q & A?