Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

CrowdTruth: What is disagreement & why does it make more sense than agreement? Lora Aroyo & Chris Welty


CrowdTruth: What is disagreement & why does it make more sense than agreement? http://crowdtruth.org

Transcript of Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

Page 1: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

CrowdTruth: What is disagreement & why does it make more sense than agreement?

Lora Aroyo & Chris Welty

Page 2: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

problem

cognitive computing systems need annotated data for training, testing, evaluation

Page 3: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

solution

human annotation through crowdsourcing augmented with machine processing

Page 4: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

What's wrong with the gold standard?

●  algorithmic performance is measured on test sets vetted by human experts → never perfectly correct

●  gold standards are created assuming that for each annotated instance there is a single right answer → doesn’t account for alternative interpretations & clarity

●  gold standard quality is measured in inter-annotator agreement → what happens if disagreeing annotators are both right?

The fallacy of the “one truth” assumption that pervades computational semantics
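
Inter-annotator agreement is often reported with a chance-corrected statistic such as Cohen's kappa. The slides do not say which measure they have in mind, so the following is only a minimal sketch of that one statistic for two annotators; the function name and the example labels are invented for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Share of items on which the two annotators gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels, not taken from the slides.
annotator_1 = ["TREAT", "TREAT", "NONE", "NONE"]
annotator_2 = ["TREAT", "NONE", "NONE", "NONE"]
print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```

A low kappa only says that the annotators differ. It cannot tell whether one of them is wrong or both readings are defensible, which is the question the slide raises.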

Page 5: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)
Page 6: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

One Truth? Does each sentence express the TREAT relation?

ANTIBIOTICS are the first line treatment for indications of TYPHUS. → agreement 95%

Patients with TYPHUS who were given ANTIBIOTICS exhibited several side-effects. → agreement 80%

With ANTIBIOTICS in short supply, DDT was used during World War II to control the insect vectors of TYPHUS. → agreement 50%

Page 7: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

One Truth? Does each sentence express the TREAT relation?

ANTIBIOTICS are the first line treatment for indications of TYPHUS. → agreement 95%

Patients with TYPHUS who were given ANTIBIOTICS exhibited several side-effects. → agreement 80%

With ANTIBIOTICS in short supply, DDT was used during World War II to control the insect vectors of TYPHUS. → agreement 50%

Disagreement can reflect lack of clarity in a sentence
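
The slides do not define how the agreement percentages were computed. One plausible reading, assumed here, is the share of workers who gave the most common answer for a sentence. The sketch below uses invented vote counts (20 workers per sentence) chosen only so that the result matches the percentages on the slide; the sentence keys are hypothetical too.

```python
from collections import Counter


def majority_share(answers):
    """Fraction of workers who gave the most frequent answer."""
    if not answers:
        raise ValueError("no answers given")
    counts = Counter(answers)
    top_count = counts.most_common(1)[0][1]
    return top_count / len(answers)


# Hypothetical votes on "does this sentence express TREAT?"
votes = {
    "first line treatment": ["yes"] * 19 + ["no"] * 1,
    "side-effects": ["yes"] * 16 + ["no"] * 4,
    "DDT and insect vectors": ["yes"] * 10 + ["no"] * 10,
}

for sentence, answers in votes.items():
    print(f"{sentence}: {majority_share(answers):.0%}")
```

Under this reading, a score close to 50% on a yes/no question marks a sentence that workers could not read one way, which is the "lack of clarity" the slide points to.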

Page 8: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

What is the relation between the highlighted terms?

GADOLINIUM agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis it presents a risk of nephrogenic systemic FIBROSIS.

One Interpretation?

Disagreement can indicate alternative interpretations of relations

cause or side effect?
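
One way to keep alternative interpretations visible, instead of collapsing them into a single label, is to store the full distribution of answers per sentence. This is a minimal sketch under that assumption; the vote counts are invented, and the entropy score is a generic ambiguity measure, not one the slides prescribe.

```python
import math
from collections import Counter


def label_distribution(answers):
    """Relative frequency of each relation label among the answers."""
    counts = Counter(answers)
    total = len(answers)
    return {label: count / total for label, count in counts.items()}


def entropy_bits(distribution):
    """Shannon entropy in bits: 0 when unanimous, higher when split."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


# Hypothetical answers for the GADOLINIUM / FIBROSIS sentence.
answers = ["cause"] * 5 + ["side effect"] * 5
distribution = label_distribution(answers)
print(distribution)
print(entropy_bits(distribution))
```

An even split between two relations gives the highest entropy two labels can reach, so the sentence is flagged as open to both readings rather than forced into one.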

Page 9: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

Does each sentence express the TREAT relation?

ANTIBIOTICS are the first line treatment for indications of TYPHUS.

QUININE is not a reliable cure for MALARIA.

Disagreement can indicate low quality workers

One Quality?
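
A simple way to surface possibly low-quality workers is to measure how often each worker agrees with the others on the same sentences. The sketch below assumes this average pairwise agreement as the score; it is an illustration, not the metric used by the authors, and the worker IDs and answers are invented.

```python
def worker_agreement(annotations):
    """Average share of co-workers who gave the same answer as each worker.

    annotations maps sentence id -> {worker id: label}.
    """
    totals, counts = {}, {}
    for answers in annotations.values():
        for worker, label in answers.items():
            others = [l for w, l in answers.items() if w != worker]
            if not others:
                continue  # nobody to compare against on this sentence
            share = sum(l == label for l in others) / len(others)
            totals[worker] = totals.get(worker, 0.0) + share
            counts[worker] = counts.get(worker, 0) + 1
    return {worker: totals[worker] / counts[worker] for worker in totals}


# Hypothetical answers to "does this sentence express TREAT?"
annotations = {
    "antibiotics-typhus": {"w1": "yes", "w2": "yes", "w3": "yes"},
    "quinine-malaria": {"w1": "no", "w2": "no", "w3": "yes"},
}
scores = worker_agreement(annotations)
print(scores)
```

A low score is only a hint. As the earlier slides argue, a worker can also disagree because the sentence is unclear, so the score is best read together with sentence-level disagreement.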

Page 10: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

Why Disagreement Happens

Page 11: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

Why Disagreement Happens

Page 12: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

Why Disagreement Happens

Page 13: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

Why Disagreement Happens

Page 14: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

Why Disagreement Happens

Disagreement is useful

Page 15: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

CrowdTruth

Annotator disagreement is signal, not noise.

It is indicative of the variation in human semantic interpretation of signs.

It can indicate ambiguity, vagueness, similarity, over-generality, etc., as well as quality.

Page 16: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

CrowdTruth is the response to the current reality of cognitive computing systems, which are driven by data analytics & elevated by interpretation.

It supports the need to bring human semantics, representing the dynamics of opinions and perspectives, into machine-readable form.

Page 17: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

CrowdTruth

captures and represents human semantics & thus helps extend the capabilities of cognitive computing systems

Page 18: Crowdsourcing & Semantic Web: Dagstuhl 2014 (Presentation Lora)

crowdtruth.org