Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Understanding and Improving Wikipedia Article Discussion Spaces
Jodi Schneider, Alexandre Passant, John Breslin
ACM SAC, 2011-03-24, Taichung, Taiwan
Wikipedia editors are leaving faster than they can be replaced
Felipe Ortega via http://www.businessinsider.com/chart-of-the-day-wikipedia-editors-2009-11
How do we turn readers into editors?
Ensure people know they can edit
Make editing easier
Help people learn how things work by reading discussions!
– “Reading Talk pages – the behind-the-scenes discussions about Wikipedia articles – signals a transition towards more active forms of participation.” – Antin & Cheshire, CSCW 2010
Make more edits “stick”
Understand what kinds of contributions are accepted
– Provide support for creating good arguments
– Avoid need for reverts
Wikipedia Discussion Space: “Talk page”
Talk pages need semantics
Lots of conversations
– Viégas: “the fastest growing areas of Wikipedia are devoted to coordination and organization”
When are people agreeing/disagreeing? Not well understood!
Very little study of Talk pages
– Largest study: 60 pages, 2 types
– Discovered: Featured Articles have 10x discussion!
– Immense variation between pages
Data from Stvilia
Social Semantic Web
My Research Questions
1. What do Wikipedians do on Talk pages?
2. What kind of arguments happen on Talk pages?
3. Can we add structure to make pages “fit” how editors and readers use them?
Three ways of understanding Talk pages
1. Interviews with editors and administrators
– What do Wikipedians do on Talk pages?
2. Hand content analysis of 100 Talk pages
– What kind of arguments happen on Talk pages?
3. Developing & using a semantic model
– Can we add structure to make pages “fit” how editors and readers use them?
1. Interviews
Administrators
– Frequently monitor conversations
– Know + meet co-editors
– Make community-related edits such as adding infoboxes
– More likely to move/rename articles and Talk pages
Editors
– Mostly read Talk pages
– “Get the scoop”: what’s controversial? More details?
– More likely to read older conversations
– May learn policy and procedures
2. Content Analysis
100 Talk pages
5 categories of pages
– Most editors (of the article)
– Most visits (to the article)
– Controversial
– Featured Articles
– Random
15 classifications
Classification Example
Reference to...
– Sources outside the wiki: “... Not sure where to put it but I’ll leave it here as somebody might find it useful”
– Reverts, removed material, or controversial edits: “I noticed some people edit the page into what it will be in 10 minutes but someone is reverting it...just let it be”
– Edits the discussant made: “Added the About.com review since the review was part of the reception section.”
Requests for...
– Help with another article, portal, etc.: “This is just to invite attention to the page Facebook statistics just created…”
The 15 Classifications
References to…
1. Vandalism
2. Guidelines and policies
3. Sources outside Wikipedia
4. Reverts, removed material, or controversial edits
5. Edits the discussant made
6. Internal Wikipedia resources
Requests for…
7. Editing coordination
8. Information
9. Help with another article
10. Peer review
Etc.
11. Off-topic remarks
12. Polls
13. Information boxes
14. Images
15. Other
3a. Developing a content-based semantic model
Represent article structure
– Reuse existing ontologies (FOAF, SIOC)
Represent content (based on the content analysis)
– Winnow the 15 classifications: relevance & plausibility
  – “Relevant” for querying and retrieving information
  – “Plausible” a person would mark their own comment
“Off topic”
“Request for help”
Represent thread structure
sioc:Thread
sioc:Post
Express relationships
sioc:links_to http://en.wikipedia.org/wiki/Template:WikiProject_Computing
Reuse SIOC & FOAF for structure
Article: sioct:WikiArticle
Link the article to the Talk page: sioc:has_discussion
Discussion threads: sioc:Thread
Individual comments: sioc:Post
Commenter: foaf:Person / sioc:UserAccount
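The mapping above can be sketched as plain subject-predicate-object triples. This is a minimal illustration in Python, not the authors' tooling. The helper names (`describe_talk_page`, `format_term`, `to_lines`) and the example.org URIs are hypothetical. The properties `sioc:has_container` and `sioc:has_creator` are taken from the RDFa example elsewhere in this deck, and choosing `sioc:UserAccount` rather than `foaf:Person` for the commenter is an assumption. How a thread is attached to its Talk page is not stated on the slide, so it is left out.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def describe_talk_page(article: str, talk_page: str, thread: str,
                       post: str, author: str) -> List[Triple]:
    """Link an article, its Talk page, one thread, one post, and the post's author."""
    return [
        (article, "rdf:type", "sioct:WikiArticle"),
        (article, "sioc:has_discussion", talk_page),
        (thread, "rdf:type", "sioc:Thread"),
        (post, "rdf:type", "sioc:Post"),
        (post, "sioc:has_container", thread),
        (post, "sioc:has_creator", author),
        (author, "rdf:type", "sioc:UserAccount"),  # assumption: account, not foaf:Person
    ]


def format_term(term: str) -> str:
    """Wrap full URIs in angle brackets; leave prefixed names such as sioc:Post alone."""
    return f"<{term}>" if term.startswith(("http://", "https://")) else term


def to_lines(triples: List[Triple]) -> List[str]:
    """Render each triple as one Turtle-style statement (prefix declarations omitted)."""
    return [f"{format_term(s)} {format_term(p)} {format_term(o)} ." for s, p, o in triples]


# Hypothetical URIs, chosen only for illustration.
BASE = "http://example.org/wiki/"
EXAMPLE = describe_talk_page(
    article=BASE + "Example",
    talk_page=BASE + "Talk:Example",
    thread=BASE + "Talk:Example#Thread1",
    post=BASE + "Talk:Example#Thread1Post1",
    author=BASE + "User:Alice",
)

if __name__ == "__main__":
    print("\n".join(to_lines(EXAMPLE)))
```

Keeping the triples as plain tuples avoids any dependency on an RDF library. A real pipeline would more likely use one and declare the namespace prefixes explicitly.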
Our SIOC WikiTalk ontology
WikiDiscussionItem
ReferenceItem
– ReferenceToEdit
– ReferenceToGuidelinesOrPolicies
– ReferenceToInternalResources
– ReferenceToRevertsOrControversialOrRemovedMaterial
– ReferenceToVandalism
RequestItem
– RequestEditingCoordination
– RequestHelpElsewhere
– RequestInfo
– RequestPeer-review
http://rdfs.org/sioc/wikitalk
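One way to work with this class list is as a parent lookup table, so that a query for a general class (say, every request) also matches its specific subclasses. The sketch below is illustrative only. It assumes that ReferenceItem and RequestItem are subclasses of WikiDiscussionItem, which is a reading of the slide layout rather than something the slide states. The names `PARENT`, `is_a`, and `subclasses_of` are hypothetical.

```python
from typing import Dict, Optional, Set

# Parent of each class. Class names are copied from the slide; the two links
# to WikiDiscussionItem are an assumption about the intended hierarchy.
PARENT: Dict[str, Optional[str]] = {
    "WikiDiscussionItem": None,
    "ReferenceItem": "WikiDiscussionItem",
    "ReferenceToEdit": "ReferenceItem",
    "ReferenceToGuidelinesOrPolicies": "ReferenceItem",
    "ReferenceToInternalResources": "ReferenceItem",
    "ReferenceToRevertsOrControversialOrRemovedMaterial": "ReferenceItem",
    "ReferenceToVandalism": "ReferenceItem",
    "RequestItem": "WikiDiscussionItem",
    "RequestEditingCoordination": "RequestItem",
    "RequestHelpElsewhere": "RequestItem",
    "RequestInfo": "RequestItem",
    "RequestPeer-review": "RequestItem",
}


def is_a(cls: str, ancestor: str) -> bool:
    """Return True if cls equals ancestor or descends from it."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)  # unknown classes end the walk
    return False


def subclasses_of(ancestor: str) -> Set[str]:
    """Return every class strictly below ancestor in the table."""
    return {c for c in PARENT if c != ancestor and is_a(c, ancestor)}


if __name__ == "__main__":
    print(sorted(subclasses_of("RequestItem")))
```

For example, `is_a("RequestInfo", "RequestItem")` is true while `is_a("RequestInfo", "ReferenceItem")` is false, so a filter on RequestItem picks up information requests but not references.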
3b. Using our semantic model
Mark up Wikipedia Talk pages by hand with RDFa
Query to find comments meeting specified criteria
– JavaScript and SPARQL
Formative evaluations
– Browsing talk pages, with & without highlighting, to identify particular types of comments
<p about="#Thread2Post1" typeof="siocwt:RequestEditingCoordination" rel="sioc:has_container" href="#Rule_Interchange_Format"></p>
<div about="#Thread2Post1" rel="sioc:has_creator" href="http://en.wikipedia.org/wiki/User:Nloth">
  <div about="#Thread2Post1" property="sioc:last_activity_date" content="2009-11-16T04:32:00Z" datatype="xsd:dateTime">
    <p>I'd support having <a href="http://en.wikipedia.org/wiki/Rule_Interchange_Format">Rule Interchange Format</a> merged into this article … <a href="http://en.wikipedia.org/wiki/User:Nloth" title="User:Nloth">Nloth</a> (<a href="http://en.wikipedia.org/wiki/User_talk:Nloth" title="User talk:Nloth">talk</a>) 04:32, 16 November 2009 (UTC)</p>
  </div>
</div>
Using the markup: JavaScript bookmarklets
Highlight posts based on the ontology class – e.g. ReferenceToEdit
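The selection step behind such a bookmarklet can be sketched outside the browser. The deck's bookmarklets are JavaScript and are not shown, so the Python below is a stand-in, not their implementation. It only finds the posts whose RDFa `typeof` attribute names a given ontology class and leaves the highlighting itself aside. `TypedPostFinder`, `find_posts`, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class TypedPostFinder(HTMLParser):
    """Collect the `about` value of every element whose `typeof` names the target class."""

    def __init__(self, target_class: str) -> None:
        super().__init__()
        self.target_class = target_class
        self.matches: List[str] = []

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        attributes = dict(attrs)
        # RDFa allows several space-separated types in one typeof attribute.
        types = (attributes.get("typeof") or "").split()
        if self.target_class in types:
            self.matches.append(attributes.get("about") or "")


def find_posts(html: str, target_class: str) -> List[str]:
    """Return the subjects (about values) of posts typed with target_class, in document order."""
    finder = TypedPostFinder(target_class)
    finder.feed(html)
    finder.close()
    return finder.matches


# Hypothetical annotated Talk page fragment.
SAMPLE = """
<p about="#Thread1Post1" typeof="siocwt:ReferenceToEdit">Added the review.</p>
<p about="#Thread2Post1" typeof="siocwt:RequestEditingCoordination">I'd support a merge.</p>
<p about="#Thread2Post2" typeof="sioc:Post siocwt:ReferenceToEdit">Done.</p>
"""

if __name__ == "__main__":
    print(find_posts(SAMPLE, "siocwt:ReferenceToEdit"))
```

A browser bookmarklet would do the equivalent lookup on the live DOM and then restyle the matching elements. The matching rule (one class name among the space-separated `typeof` values) is the part this sketch shares with it.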
Retrieve RequestInfo posts in WikiProject Computing
We retrieve the “RequestInfo” posts with SPARQL:
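PREFIX sioc: <http://rdfs.org/sioc/ns#>
# Assumed namespace IRI, derived from the ontology URL http://rdfs.org/sioc/wikitalk
PREFIX siocwt: <http://rdfs.org/sioc/wikitalk#>
SELECT ?comment ?page WHERE {
  ?page sioc:links_to <http://en.wikipedia.org/wiki/Template:WikiProject_Computing> .
  ?comment sioc:has_container ?page ; a sioc:Post ; a siocwt:RequestInfo .
}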
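To make the logic of this query concrete, the sketch below applies the same three conditions to a small in-memory set of triples: the page links to the WikiProject Computing template, the comment is contained in that page, and the comment is typed both as a post and as a RequestInfo item. It is a hand-rolled illustration in Python, not a SPARQL engine. The function `request_info_posts`, the `ex:` identifiers, and the sample data are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

PROJECT = "http://en.wikipedia.org/wiki/Template:WikiProject_Computing"


def request_info_posts(triples: Iterable[Triple],
                       project: str = PROJECT) -> List[Tuple[str, str]]:
    """Return sorted (comment, page) pairs that satisfy the query's conditions."""
    facts: Set[Triple] = set(triples)
    # ?page sioc:links_to <project>
    pages = {s for (s, p, o) in facts if p == "sioc:links_to" and o == project}
    results = []
    for (s, p, o) in facts:
        # ?comment sioc:has_container ?page ; a sioc:Post ; a siocwt:RequestInfo
        if (p == "sioc:has_container" and o in pages
                and (s, "rdf:type", "sioc:Post") in facts
                and (s, "rdf:type", "siocwt:RequestInfo") in facts):
            results.append((s, o))
    return sorted(results)


# Hypothetical data: two Talk pages, only one of which links to the project template.
DATA: List[Triple] = [
    ("ex:Talk_A", "sioc:links_to", PROJECT),
    ("ex:Talk_B", "sioc:links_to", "http://example.org/OtherProject"),
    ("ex:A_post1", "sioc:has_container", "ex:Talk_A"),
    ("ex:A_post1", "rdf:type", "sioc:Post"),
    ("ex:A_post1", "rdf:type", "siocwt:RequestInfo"),
    ("ex:A_post2", "sioc:has_container", "ex:Talk_A"),
    ("ex:A_post2", "rdf:type", "sioc:Post"),
    ("ex:A_post2", "rdf:type", "siocwt:ReferenceToEdit"),
    ("ex:B_post1", "sioc:has_container", "ex:Talk_B"),
    ("ex:B_post1", "rdf:type", "sioc:Post"),
    ("ex:B_post1", "rdf:type", "siocwt:RequestInfo"),
]

if __name__ == "__main__":
    print(request_info_posts(DATA))
```

In this sample only `ex:A_post1` qualifies: `ex:A_post2` is the wrong class, and `ex:B_post1` sits on a page outside the project.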
Summary
We can increase the effectiveness of Wikipedia Talk pages by understanding how they are used
We add semantic structure to Wikipedia Talk pages which can be used to extract socially useful info
Social Semantic Web expertise can benefit Wikipedia
Thank You!
Questions & Comments?
Contact: [email protected]
Thanks to SAC-STAP for travel support and to Science Foundation Ireland for Ph.D. funding, Grant No. SFI/09/CE/I1380 (Líon2)!
Our Wikipedia-Related Research
“Understanding and Improving Wikipedia Article Discussion Spaces.” In SAC 2011 (Web Track), Taichung, Taiwan, March 21-25, 2011.
“Enhancing MediaWiki Talk pages with Semantics for Better Coordination - A Proposal.” In The Fifth Workshop on Semantic Wikis: Linking Data and People Workshop at 7th Extended Semantic Web Conference (ESWC), Crete, Greece, May 31, 2010.
“A Content Analysis: How Wikipedia Talk Pages Are Used.” In WebSci2010, Web Science Conference. Raleigh, NC, April 26 & 27 2010.
References
Antin, J., & Cheshire, C. (2010). Readers are not free-riders: Reading as a form of participation on Wikipedia. CSCW 2010. doi: 10.1145/1718918.1718942
Stvilia, Twidale, Smith & Gasser, "Information Quality Work Organization in Wikipedia," JASIST 2008. doi: 10.1002/asi.2081
Viégas, Wattenberg, Kriss & Ham, "Talk Before You Type: Coordination in Wikipedia," HICSS 2007. doi: 10.1109/HICSS.2007.511
Further image credits
Felipe Ortega’s dissertation research
Wikipedia logo
Talk pages screenshots from http://en.wikipedia.org/Talk:{articlename}