Quality and collaboration in Wikidata


Transcript of Quality and collaboration in Wikidata

Page 1: Quality and collaboration in Wikidata

QUALITY AND COLLABORATION IN WIKIDATA

Elena Simperl and Alessandro Piscopo

University of Southampton, UK

@esimperl

Page 2: Quality and collaboration in Wikidata

OVERVIEW

Wikidata is a critical AI asset in many applications

A recent Wikimedia project (launched in 2012), edited collaboratively

Our research assesses the quality of Wikidata and the link between community processes and quality

Page 3: Quality and collaboration in Wikidata

WHAT IS WIKIDATA

Page 4: Quality and collaboration in Wikidata

BASIC FACTS

Collaborative knowledge graph

100k registered users, 35M items

Open licence

RDF exports, connected to Linked Open Data Cloud

Page 5: Quality and collaboration in Wikidata

THE KNOWLEDGE GRAPH: STATEMENTS, ITEMS, PROPERTIES

Item identifiers start with a Q, property identifiers start with a P

[Diagram: the statement Q84 (London) → P6 (head of government) → Q334155 (Sadiq Khan)]
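Since Wikidata exposes these statements through its public SPARQL endpoint, the statement in the diagram can be retrieved programmatically. A minimal sketch (the User-Agent string is an arbitrary placeholder):

```python
# Minimal sketch: ask Wikidata's public SPARQL endpoint for the head of
# government (P6) of London (Q84), i.e. the statement in the diagram above.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?mayor ?mayorLabel WHERE {
  wd:Q84 wdt:P6 ?mayor .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

resp = requests.get(ENDPOINT,
                    params={"query": QUERY, "format": "json"},
                    headers={"User-Agent": "wikidata-quality-demo/0.1"})
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["mayor"]["value"], row["mayorLabel"]["value"])
```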

Page 6: Quality and collaboration in Wikidata

THE KNOWLEDGE GRAPH: ITEMS CAN BE CLASSES, ENTITIES, VALUES

[Diagram: items acting as classes, entities, and values, e.g. Q84 (London) → P6 (head of government) → Q334155 (Sadiq Khan), alongside Q7259 (Ada Lovelace), Q727 (Amsterdam), Q515 (city), Q6581097 (male), Q59360 (Labour party), Q145 (United Kingdom)]

Page 7: Quality and collaboration in Wikidata

THE KNOWLEDGE GRAPH: ADDING CONTEXT TO STATEMENTS

Statements may include context: qualifiers (optional) and references (required)

Two types of references: internal, linking to another item; external, linking to a webpage

[Diagram: the Q84 (London) → P6 (head of government) → Q334155 (Sadiq Khan) statement with qualifier "9 May 2016" and external reference https://www.london.gov.uk/...]
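In the entity JSON that Wikidata serves, qualifiers and references hang off each statement. A short sketch of reading them, following the Wikibase JSON model (error handling omitted):

```python
# Sketch: read qualifiers and references from the entity JSON for Q84,
# following the Wikibase JSON model (claims -> mainsnak/qualifiers/references).
import requests

url = "https://www.wikidata.org/wiki/Special:EntityData/Q84.json"
entity = requests.get(url, headers={"User-Agent": "wikidata-quality-demo/0.1"}).json()

for claim in entity["entities"]["Q84"]["claims"].get("P6", []):
    value = claim["mainsnak"]["datavalue"]["value"]["id"]  # e.g. "Q334155"
    qualifiers = claim.get("qualifiers", {})   # optional context, e.g. start time
    references = claim.get("references", [])   # provenance records
    print(value, "| qualifiers:", list(qualifiers), "| references:", len(references))
```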

Page 8: Quality and collaboration in Wikidata

THE KNOWLEDGE GRAPH: CO-EDITED BY BOTS AND HUMANS

Human editors can register or work anonymously

Bots created by community for routine tasks

Page 9: Quality and collaboration in Wikidata

OUR WORK

Influence of community make-up on outcomes

Effects of editing practice on outcomes

Data quality, as a function of its provenance

Page 10: Quality and collaboration in Wikidata

THE RIGHT MIX OF USERS

Piscopo, A., Phethean, C., & Simperl, E. (2017). What Makes a Good Collaborative Knowledge Graph: Group Composition and Quality in Wikidata. International Conference on Social Informatics, 305-322, Springer.

Page 11: Quality and collaboration in Wikidata

BACKGROUND

Wikidata editors have varied tenure and interests

Group composition impacts outcomes

Diversity can have multiple effects

Moderate tenure diversity increases outcome quality

Interest diversity leads to increased group productivity

Chen, J., Ren, Y., & Riedl, J. (2010). The effects of diversity on group productivity and member withdrawal in online volunteer groups. In Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI '10), p. 821. ACM Press, New York, USA.

Page 12: Quality and collaboration in Wikidata

OUR STUDY

Analysed the edit history of items: used a corpus of 5000 items whose quality has been manually assessed (5 levels)*

Edit history focused on community make-up

Community is defined as the set of editors of an item

Considered features from group diversity literature and Wikidata-specific aspects

*https://www.wikidata.org/wiki/Wikidata:Item_quality

Page 13: Quality and collaboration in Wikidata

RESEARCH HYPOTHESES

Activity → Outcome

H1: Bot edits → item quality
H2: Bot-human interaction → item quality
H3: Anonymous edits → item quality
H4: Tenure diversity → item quality
H5: Interest diversity → item quality

Page 14: Quality and collaboration in Wikidata

DATA AND METHODS

Ordinal regression analysis; four models were trained

Dependent variable: quality level of the 5000 labelled Wikidata items

Independent variables:

Proportion of bot edits

Bot-human edit proportion

Proportion of anonymous edits

Tenure diversity: coefficient of variation of editors' tenure

Interest diversity: based on the user editing matrix

Control variables: group size, item age
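A sketch of this modelling setup (not the authors' code; the CSV file and column names are hypothetical), using statsmodels' ordered-outcome model:

```python
# Illustrative sketch of the modelling setup (not the authors' code).
# Assumes a hypothetical per-item table with one row per item community.
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

df = pd.read_csv("item_groups.csv")  # hypothetical file with the columns below

# Tenure diversity: coefficient of variation of the editors' tenure
# (this is how a per-item tenure_cv column could be derived).
def tenure_cv(tenures: np.ndarray) -> float:
    tenures = np.asarray(tenures, dtype=float)
    return float(tenures.std() / tenures.mean()) if tenures.mean() > 0 else 0.0

features = ["prop_bot_edits", "bot_human_ratio", "prop_anon_edits",  # H1-H3
            "tenure_cv", "interest_diversity",                       # H4, H5
            "group_size", "item_age"]                                # controls

# Ordinal (ordered logit) regression: quality takes 5 ordered levels
model = OrderedModel(df["quality"], df[features], distr="logit")
result = model.fit(method="bfgs", disp=False)
print(result.summary())
```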

Page 15: Quality and collaboration in Wikidata

RESULTS: ALL HYPOTHESES SUPPORTED

[Figure: regression results for H1-H5]

Page 16: Quality and collaboration in Wikidata

LESSONS LEARNED

01 The more is not always the merrier

02 Bot edits are key for quality, but bots and humans together are better

03 Diversity matters

Page 17: Quality and collaboration in Wikidata

IMPLICATIONS

01 Encourage registration

02 Identify further areas for bot editing

03 Design effective human-bot workflows

04 Suggest items to edit based on tenure and interests

Page 18: Quality and collaboration in Wikidata

LIMITATIONS AND FUTURE WORK

▪ Measures of quality over time required

▪ Sample vs Wikidata (most items C or lower)

▪ Other group features (e.g., coordination) not considered

▪ No distinction between editing activities (e.g., schema vs instances, topics etc.)

▪ Different metrics of interest (topics, type of activity)

Page 19: Quality and collaboration in Wikidata

THE DATA IS AS GOOD AS ITS REFERENCES

Piscopo, A., Kaffee, L. A., Phethean, C., & Simperl, E. (2017). Provenance Information in a Collaborative Knowledge Graph: an Evaluation of Wikidata External References. International Semantic Web Conference, 542-558, Springer.

Page 20: Quality and collaboration in Wikidata

PROVENANCE IN WIKIDATA

Statements may include context: qualifiers (optional) and references (required)

Two types of references: internal, linking to another item; external, linking to a webpage

[Diagram: the Q84 (London) → P6 (head of government) → Q334155 (Sadiq Khan) statement with qualifier "9 May 2016" and external reference https://www.london.gov.uk/...]

Page 21: Quality and collaboration in Wikidata

THE ROLE OF PROVENANCE

Wikidata aims to become a hub of references

Data provenance increases trust in Wikidata

Lack of provenance hinders data reuse

The quality of references is as yet unknown

Hartig, O. (2009). Provenance Information in the Web of Data. LDOW, 538.

Page 22: Quality and collaboration in Wikidata

OUR STUDY

Approach to evaluate quality of external references in Wikidata

Quality is defined by the Wikidata verifiability policy:

Relevant: supports the statement it is attached to

Authoritative: trustworthy, up-to-date, and free of bias for supporting a particular statement

Large-scale (the whole of Wikidata)

Bot vs. human-contributed references

Page 23: Quality and collaboration in Wikidata

RESEARCH QUESTIONS

RQ1 Are Wikidata external references relevant?

RQ2 Are Wikidata external references authoritative?

▪ I.e., do they match the author and publisher types from the Wikidata policy?

RQ3 Can we automatically detect non-relevant and non-authoritative references?

Page 24: Quality and collaboration in Wikidata

METHODS: TWO-STAGE MIXED APPROACH

1. Microtask crowdsourcing (RQ1, RQ2)

▪ Evaluate relevance & authoritativeness of a reference sample

▪ Create training set for machine learning model

2. Machine learning (RQ3)

▪ Large-scale reference quality prediction

Page 25: Quality and collaboration in Wikidata

STAGE 1: MICROTASK CROWDSOURCING

▪ 3 tasks on Crowdflower

▪ 5 workers/task, majority voting

▪ Test questions to select workers

Feature | Microtask | Description
Relevance (RQ1) | T1 | Does the reference support the statement?
Authoritativeness (RQ2) | T2 | Choose author type from list
Authoritativeness (RQ2) | T3.A | Choose publisher type from list
Authoritativeness (RQ2) | T3.B | Verify publisher type, then choose sub-type from list

Page 26: Quality and collaboration in Wikidata

STAGE 2: MACHINE LEARNING

Compared three algorithms: Naïve Bayes, Random Forest, SVM

Features based on [Lehmann et al., 2012 & Potthast et al., 2008]

Baseline: item labels matching (relevance); deprecated domains list (authoritativeness) (RQ3)

Features: URL reference uses; source HTTP code; statement item vector; statement object vector; author activity; subject parent class; property parent class; object parent class; author type; author activity on references
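A sketch of this model comparison (illustrative only: the stand-in dataset below replaces the crowdsourced training set, and hyperparameters are library defaults, not the paper's):

```python
# Sketch of the stage-2 comparison. The stand-in dataset below replaces the
# crowdsourced training set; hyperparameters are defaults, not the paper's.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import f1_score, matthews_corrcoef

# Stand-in for the labelled references (features as in the list above)
X, y = make_classification(n_samples=1701, n_features=10, random_state=0)

models = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
}
for name, clf in models.items():
    pred = cross_val_predict(clf, X, y, cv=10)
    print(f"{name}: F1={f1_score(y, pred):.2f}, MCC={matthews_corrcoef(y, pred):.2f}")
```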

Page 27: Quality and collaboration in Wikidata

DATA

1.6M external references (6% of all references); 1.4M come from two sources (protein KBs)

83,215 English-language references; sample of 2586 (99% confidence, 2.5% margin of error)

885 assessed automatically, e.g., non-working links or CSV files
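The reported sample size is consistent with the usual formula for estimating a proportion, with finite-population correction; a quick check:

```python
# Quick check: sample size for a proportion at 99% confidence and 2.5%
# margin of error, with finite-population correction for N = 83,215.
z = 2.576                        # 99% two-sided normal quantile
p, e, N = 0.5, 0.025, 83_215
n0 = z**2 * p * (1 - p) / e**2   # infinite-population size, ~2654
n = n0 / (1 + (n0 - 1) / N)      # corrected size, ~2572
print(round(n0), round(n))       # in line with the 2586 references sampled
```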

Page 28: Quality and collaboration in Wikidata

RESULTS (CROWDSOURCING): CROWDSOURCING WORKS

▪ Trusted workers: >80% accuracy

▪ 95% of responses from T3.A confirmed in T3.B

Task | No. of microtasks | Total workers | Trusted workers | Workers' accuracy | Fleiss' κ
T1 | 1701 references | 457 | 218 | 75% | 0.335
T2 | 1178 links | 749 | 322 | 75% | 0.534
T3.A | 335 web domains | 322 | 60 | 66% | 0.435
T3.B | 335 web domains | 239 | 116 | 68% | 0.391
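Fleiss' κ in the last column measures agreement among the five workers per microtask; a minimal sketch of how it can be computed, using toy ratings and the statsmodels implementation:

```python
# Minimal sketch: Fleiss' kappa over 5 workers per microtask (toy ratings).
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = items (e.g. references in T1), columns = the 5 workers' answers;
# 1 = "reference supports the statement", 0 = "it does not"
ratings = np.array([
    [1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1],
])
table, _ = aggregate_raters(ratings)  # item-by-category count table
print(fleiss_kappa(table))
```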

Page 29: Quality and collaboration in Wikidata

RESULTS (CROWDSOURCING): MAJORITY OF REFERENCES ARE HIGH QUALITY (RQ1, RQ2)

2586 references evaluated

Found 1674 valid references from 345 domains

Broken URLs deemed not relevant and not authoritative

Page 30: Quality and collaboration in Wikidata

RESULTS (CROWDSOURCING): HUMANS ARE BETTER AT EDITING REFERENCES (RQ1, RQ2)

[Chart: quality of human-added vs bot-added references]

Page 31: Quality and collaboration in Wikidata

RESULTS (CROWDSOURCING): DATA FROM GOVT. AND ACADEMIA (RQ2)

Most common author type (T2): organisation (78%)

Most common publisher types (T3): governmental agencies (37%), academic organisations (24%)

Page 32: Quality and collaboration in Wikidata

RESULTS (MACHINE LEARNING): RANDOM FORESTS PERFORM BEST (RQ3)

Model | F1 | MCC
Relevance: Baseline | 0.84 | 0.68
Relevance: Naïve Bayes | 0.90 | 0.86
Relevance: Random Forest | 0.92 | 0.89
Relevance: SVM | 0.91 | 0.87
Authoritativeness: Baseline | 0.53 | 0.16
Authoritativeness: Naïve Bayes | 0.86 | 0.78
Authoritativeness: Random Forest | 0.89 | 0.83
Authoritativeness: SVM | 0.89 | 0.79

Page 33: Quality and collaboration in Wikidata

LESSONS LEARNED

Crowdsourcing+ML works!

Many external sources are high quality

Bad references are mainly non-working links; continuous monitoring is required

Lack of diversity in bot-added sources

Humans and bots are good at different things

Page 34: Quality and collaboration in Wikidata

LIMITATIONS AND FUTURE WORK

Studies with non-English sources

New approach for internal references

Deployment in Wikidata, including changes in editing behaviour

Page 35: Quality and collaboration in Wikidata

THE COST OF FREEDOM: ON THE ROLE OF PROPERTY CONSTRAINTS IN WIKIDATA


Page 36: Quality and collaboration in Wikidata

BACKGROUND

Wikidata is built by the community, from scratch

Editors are free to carry out any kind of edit

There is tension between editing freedom and quality of the modelling

Property constraints have been introduced at a later stage

Currently 18 constraints, but they are not enforced

Hall, A., McRoberts, S., Thebault-Spieker, J., Lin, Y., Sen, S., Hecht, B., & Terveen, L. (2017). Freedom versus standardization: structured data generation in a peer production community. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 6352-6362). ACM.

Page 37: Quality and collaboration in Wikidata

OUR STUDY

Effects of property constraints on:

Content quality, i.e., increasing user awareness of property use

Diversity of expression

Editor behaviour, by increasing conflict level

Page 38: Quality and collaboration in Wikidata

THE COST OF FREEDOM: CLAIMS

▪ Several claims can be expressed for a statement, thanks to qualifiers and references

[Diagram: two claims for Q84 (London) → P6 (head of government): Q334155 (Sadiq Khan), qualifier 9 May 2016, reference https://www.london.gov.uk/…; and Q180589 (Boris Johnson), qualifier 4 May 2008, reference https://www.london.gov.uk/…]

Page 39: Quality and collaboration in Wikidata

RESEARCH HYPOTHESES

Activity → Outcome

H1: Property constraints → property perspicuity
H2: Property constraints → knowledge diversity
H3: Property constraints → level of conflict

Page 40: Quality and collaboration in Wikidata

METRICS

▪ Property perspicuity: V = N_violations / N_claims

▪ Knowledge diversity: KD_score = N_claims / N_statements

▪ Controversy metric (conflicting edits): C_score = N_confl_edits / N_edits, with 0 ≤ C_score ≤ 1
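These three ratios translate directly into code; a small sketch (function and argument names are ours, not the paper's):

```python
# Direct translation of the three metrics (names are ours, not the paper's).
def perspicuity(n_violations: int, n_claims: int) -> float:
    """V: share of claims that violate a property constraint."""
    return n_violations / n_claims

def knowledge_diversity(n_claims: int, n_statements: int) -> float:
    """KD_score: claims per statement (several claims may share a statement)."""
    return n_claims / n_statements

def controversy(n_conflicting_edits: int, n_edits: int) -> float:
    """C_score in [0, 1]: share of edits that conflict with earlier ones."""
    return n_conflicting_edits / n_edits
```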

Page 41: Quality and collaboration in Wikidata

METHODS

H1: Linear trend analysis of the violation ratio V over time

H2 and H3: Lagged multiple regression models to predict changes between T_n and T_{n-1} in KD_score and C_score
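A sketch of the lagged design for H2 (H3 is analogous, with C_score in place of KD_score); the panel CSV and its column names are hypothetical:

```python
# Sketch of the lagged design for H2; the panel CSV and column names are
# hypothetical. H3 is analogous, with c_score in place of kd_score.
import pandas as pd
import statsmodels.api as sm

panel = pd.read_csv("property_timeframes.csv")  # one row per property per T_n
panel = panel.sort_values(["property", "t"])
panel["kd_change"] = panel.groupby("property")["kd_score"].diff()  # T_n minus T_{n-1}
panel["constraints_lag"] = panel.groupby("property")["n_constraints"].shift(1)
panel = panel.dropna(subset=["kd_change", "constraints_lag"])

X = sm.add_constant(panel[["constraints_lag"]])
print(sm.OLS(panel["kd_change"], X).fit().summary())
```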

Page 42: Quality and collaboration in Wikidata

RESULTS

H1 was supported, but limited to some constraints

12 constraints out of 18 showed significant variations over the time frame observed

The constraint with the largest variation was type (i.e., property domain)

Page 43: Quality and collaboration in Wikidata

RESULTS

H2 was rejected; in fact, more property constraints at the beginning of a time frame lead to decreased knowledge diversity

Page 44: Quality and collaboration in Wikidata

RESULTS

H3 was rejected; constraints lead to fewer conflicts

Page 45: Quality and collaboration in Wikidata

LIMITATIONS

Wikidata is still at an early stage of development

Metrics need further refinement

Changes were made to constraints after our analysis, which could produce new effects

Page 46: Quality and collaboration in Wikidata

LESSONS LEARNED

Editors seem to understand meaning of property constraints

Low level of knowledge diversity and conflict overall

Non-enforcement of constraints seems to have only limited effect on community dynamics

Effects of when and how constraints are introduced not explored yet


Page 47: Quality and collaboration in Wikidata

CONCLUSIONS


Page 48: Quality and collaboration in Wikidata

SUMMARY OF FINDINGS

Collaboration between humans and bots is important

Tools needed to identify tasks for bots and continuously study their effects on outcomes and community

References are high quality, though biases exist in terms of choice of sources

Wikidata’s approach to knowledge engineering questions existing theoretical and empirical literature