Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Enabling Networked Knowledge
Deletion Discussions in Wikipedia*: Decision Factors and Outcomes
Jodi Schneider, Alexandre Passant, & Stefan Decker
WikiSym 2012, Linz, Austria
Wednesday 29th August 2012
*enWP
Big questions about WP
Is crowdsourcing sustainable?
Is content bias manageable?
Does it matter who writes WP?
How can newcomers be welcomed and socialized?
… are related to Deletion
Is crowdsourcing sustainable? How do we maintain content through deletion?
Is content bias manageable? Are new articles needed? Are they welcomed?
Does it matter who writes WP? … or who makes deletion decisions?
How can newcomers be welcomed and socialized? Deletion threatens editor retention
– 1 in 3 editors begin by creating a new article
– 7 times as likely to stay if their article is kept
Source: [[User:Mr.Z-man/newusers]] via [[Wikipedia:Wikipedia_Signpost/2011-04-04/Editor_retention]]
Overall Goals
Understand outcomes of deletion discussions
– What are good outcomes for articles? ... for the community?
Provide support to various groups
– Readers/New Editors
– Debate Closers
– People Reading Archived Debates
This Study’s Research Questions
1. What factors contribute to the decision about whether to delete a given article?
2. When multiple factors are given, what is the relative importance of those factors?
3. What are the outcomes of deletion discussions, both for articles and for the community?
Overview
Outcomes (RQ3)
Data, Methods, Previous Research
Factors (RQs 1&2)
Future Work on Support (Demo)
Articles: Good Outcomes
… Content Expansion
Good Rationale
Good Outcome?
Community: Good Outcomes
Learning to argue effectively
Becoming more detached from content
Introducing new editors to community values
Developing new editors’ editing skills
Example: Good Community Outcomes
William Vickers (fiddler)
1 main author (their first article)
Nominated for deletion after 1 hour and 20 minutes
Shaped during the process
Changes During AfD
Article renamed to William Vickers manuscript
Discography added
26 edits from this author
Supporting the Editor
This was the first article this editor created.
Overall, 11 articles were later created by this editor.
The creator made many more edits to this article: 26 edits, compared to 3-9 edits to each of his later articles.
Overview
Outcomes (RQ3)
Data, Methods, Previous Research
Factors (RQs 1&2)
Future Work on Support (Demo)
Discussion-based Deletion
“Articles for Deletion” (AfD)
Most contentious
Articulated decision-making
500+ deletion discussions/week
~12% of deletions
Lam & Riedl. “Is Wikipedia growing a longer tail?” GROUP ’09
Dataset
Data Corpus: “Typical Day”
72 deletion discussions
January 29, 2011
English Wikipedia only
Methods
Deep analysis of a moderate-sized dataset
Representative sample
Intensive manual analysis
Annotation with multiple coders
Descriptive statistics
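Annotation with multiple coders is usually checked with an agreement statistic. The slides do not say which measure was used, so the sketch below assumes Cohen's kappa for two coders. The function name `cohens_kappa` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Share of items on which the two coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels (assumption): one factor code per comment.
coder_1 = ["notability", "sources", "notability", "bias", "maintenance", "notability"]
coder_2 = ["notability", "sources", "sources", "bias", "maintenance", "notability"]
kappa = cohens_kappa(coder_1, coder_2)
```

Kappa corrects raw agreement for chance, which matters when one code (here, notability) dominates the labels.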
Previous Research
Shallow analysis of large datasets
Redacted content
– West & Lee, “What Wikipedia deletes”, WikiSym 2011
Vote sequencing
– Taraborelli & Ciampaglia, “Beyond notability”, SASOW 2011
Decision quality
– Lam, Karim & Riedl, “The effects of group composition on decision quality in a social production community”, GROUP 2010
Who participates, what & how much gets deleted
– Priedhorsky, Chen, Lam, Panciera, Terveen, & Riedl, “Creating, destroying, and restoring value in Wikipedia”, GROUP 2007
– Geiger & Ford, “Participation in Wikipedia’s article deletion processes”, WikiSym 2011
From Reading to Editing
How can newcomers be welcomed and socialized? Deletion threatens editor retention
– 1 in 3 editors begin by creating a new article
– 7 times as likely to stay if their article is kept
Source: [[User:Mr.Z-man/newusers]] via [[Wikipedia:Wikipedia_Signpost/2011-04-04/Editor_retention]]
Instructions?
Notabili-what?
22% of all deletions are speedy deletions under criterion A7: No indication of importance
Geiger & Ford WikiSym 2011
Reader’s View of Deletion
Novices vs. Experts in deletion discussions
Worthwhile content that is poorly defended -> deleted
Need Wikipedia knowledge (procedural knowledge)
Need content knowledge
Articulate Values/Criteria
4 Factors in Deletion Discussions cover 91% of comments and 70% of discussions
The best way to avoid deletion is for readers to understand these criteria.
Article Feedback
4 Factors (RQ1)
Factor: Example (used to justify 'keep')
Notability: "Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia."
Sources: "Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band."
Maintenance: "…this article is savable but at its current state, needs a lot of improvement."
Bias: "It is by no means spam (it does not promote the products)."
Other: "I'm advocating a blanket "hangon" for all articles on newly-drafted players"
The best way to avoid deletion is for readers to understand these 4 criteria:
Notability
Sources
Maintenance
Bias
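The coverage figures for the four factors are descriptive statistics over coded comments. The sketch below shows one way such shares could be computed. It assumes a comment counts as covered when every code it carries is one of the four factors, and a discussion counts as covered when all of its comments are; the slides do not spell out either definition. The names `FACTORS` and `factor_coverage` and the toy records are hypothetical.

```python
from collections import defaultdict

# The four factors named in the slides.
FACTORS = {"notability", "sources", "maintenance", "bias"}


def factor_coverage(comments):
    """Return (share of covered comments, share of covered discussions).

    comments: list of (discussion_id, set_of_codes) pairs.
    Assumed definitions: a comment is covered when it has at least one code
    and all of its codes are in FACTORS; a discussion is covered when all of
    its comments are covered.
    """
    if not comments:
        raise ValueError("need at least one coded comment")
    flags_by_discussion = defaultdict(list)
    for discussion_id, codes in comments:
        covered = bool(codes) and set(codes) <= FACTORS
        flags_by_discussion[discussion_id].append(covered)
    covered_comments = sum(sum(flags) for flags in flags_by_discussion.values())
    covered_discussions = sum(all(flags) for flags in flags_by_discussion.values())
    return (
        covered_comments / len(comments),
        covered_discussions / len(flags_by_discussion),
    )


# Hypothetical toy records (assumption), not data from the study.
toy_comments = [
    ("afd-1", {"notability"}),
    ("afd-1", {"sources", "notability"}),
    ("afd-2", {"other"}),
    ("afd-2", {"bias"}),
]
comment_share, discussion_share = factor_coverage(toy_comments)
```

Under these definitions the discussion share can never exceed what the comment share allows, since a single uncovered comment makes its whole discussion uncovered.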
Factors in Context
Relative importance (RQ2)
Notability trumped by other values
Comprehensiveness > Notability (given Sources)
Keeping a (non-notable) Velvet Underground album:
"we shouldn’t mechanically apply notability guidelines in this instance, where it would “[punch a] hole in their otherwise comprehensive discography.”"
Maintenance > Notability
Deleting a notable topic due to maintenance:
"this is the rare case where notability is not the main argument in favor of deletion. It has been demonstrated that the subject is already covered in numerous other articles and that those articles do a much better, more thorough job of covering the topic."
Issues
Discussions fail without comments
Interactions with article creators
– Contentious
– Learning opportunity
Conflicts around consensus values
– Notability: "Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?"
– Reliable sources
Policy development is separated from case debates
– "Frankly, the basis of my disagreement with you here is that I don’t agree with the guideline."
Future Work
Factor-based view of deletion
Please give me feedback!
Thanks!
[email protected]
jodischneider.com/jodi.html
@jschneider
User:Jodi.a.schneider
Deletion Workflow
Articles for Deletion (AfD)
Friction with outside
Novices don’t understand notability
Notability vs. real-world importance
"Emsworth Cricket Club is one of the oldest cricket clubs in the world, and this really is worth a mention. Especially on a website, where pointless people … gets a mention."
"Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?"