Reading Preference and Behavior on Wikipedia
![Page 1: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/1.jpg)
Reading Preference and Behavior on Wikipedia
Janette Lehmann, Claudia Müller-Birn, David Laniado,
Mounia Lalmas, Andreas Kaltenbrunner
photo credit: marissa, CC BY 2.0
![Page 2: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/2.jpg)
![Page 3: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/3.jpg)
![Page 4: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/4.jpg)
• Second-class members of an online community (Preece et al. 2004)
• “Lurkers” or “free-riders” (e.g., Nonnecke, 2000; Nonnecke, 2004)
• More resource-taking than value-adding (Kollock, 1990)
• Only valuable when they become active contributors (Preece et al. 2004)
![Page 5: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/5.jpg)
Why is it useful to study readers?
• Improving the article quality evaluation
– Defining new metrics to measure article quality (e.g., reading time)
– Interweaving explicit (AFT) and implicit feedback
• Improving the interface design
• Giving authors positive feedback
– Authors feel that their work is more valuable when many users read the article
• Improving the reading experience
– Users … having a good reading experience … returning more often … becoming contributors
![Page 6: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/6.jpg)
(1) We studied users’ reading preferences
- what they read -
(2) We analyzed users’ reading behaviors
- how they read -
![Page 7: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/7.jpg)
![Page 8: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/8.jpg)
Preference matrix of biography articles
• Editing preference of an article: article length at the end of our data period
• Reading preference of an article: median monthly article popularity, measured by the number of page views
• 74.1% of the articles have an average article length or popularity.
• We focus on the remaining 25.9%, the extreme cases.
Data set
Page view data from Wikipedia
1M biography articles
460M page views
Sep 2011 – Sep 2012
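The two axes of the preference matrix can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the slides define the editing preference (article length) and the reading preference (median monthly page views), but they do not state where "low", "average", and "high" are separated, so the cut-off values, the function names, and the sample numbers below are all hypothetical.

```python
from statistics import median


def reading_preference(monthly_views):
    """Median monthly article popularity, measured by page views."""
    return median(monthly_views)


def level(value, low_cut, high_cut):
    """Bin a value as 'low', 'average', or 'high' (cut-offs are hypothetical)."""
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "average"


def preference_cell(article_length, monthly_views, length_cuts, views_cuts):
    """Return (editing level, reading level) for one article."""
    editing = level(article_length, *length_cuts)
    reading = level(reading_preference(monthly_views), *views_cuts)
    return editing, reading


# Illustrative values only; not taken from the study's data.
cell = preference_cell(
    article_length=42_000,
    monthly_views=[120, 90, 300, 80, 110],
    length_cuts=(5_000, 30_000),   # assumed cut-offs for article length
    views_cuts=(200, 5_000),       # assumed cut-offs for monthly page views
)
print(cell)
```

An article that lands in the ("average", "average") cell on both axes would belong to the majority that the study sets aside; the remaining cells are the extreme cases.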
![Page 9: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/9.jpg)
Preference matrix of biography articles
For 9.8% (group I) and 7.9% (group III) of the articles, editing and reading activity is high.
![Page 10: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/10.jpg)
Preference matrix of biography articles
For 4.0% (group II) of the articles, editing activity is high, but reading activity is low.
![Page 11: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/11.jpg)
Preference matrix of biography articles
For 4.2% (group IV) of the articles, editing activity is low, but reading activity is high.
![Page 12: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/12.jpg)
Reading preferences
• Dominance of entertainment-related topics on Wikipedia
• There are articles where editing and reading preferences do not align
– Being aware of these divergences can help editors make informed decisions about which articles to focus on next.
– Temporal changes in popularity should also be taken into account.
![Page 13: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/13.jpg)
(1) We studied users’ reading preferences ✔
- what they read -
(2) We analyzed users’ reading behaviors
- how they read -
![Page 14: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/14.jpg)
Reading session
Session metrics
article views: 3
reading time: 4.3min
session articles: 5
[Figure: session timeline from session start to session end; the three article views last 0.5 min, 1.8 min, and 2 min]
Data set
Browsing data from the Yahoo toolbar
288K biography articles
387K users
4.5M page views
Sep 2011 – Sep 2012
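The three session metrics from the reading-session example can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the `PageView` record, the `session_metrics` function, and the article names are hypothetical, and "session articles" is assumed here to count the distinct articles visited in the session (the slide does not say whether repeated views are counted).

```python
from dataclasses import dataclass


@dataclass
class PageView:
    article: str
    seconds: int  # time spent on the page before moving on


def session_metrics(session, focal):
    """Compute the three session metrics for one focal article."""
    views = [pv for pv in session if pv.article == focal]
    return {
        # How often the focal article was opened within the session.
        "article_views": len(views),
        # Total time spent on the focal article, in minutes.
        "reading_time_min": sum(pv.seconds for pv in views) / 60,
        # Assumption: distinct articles visited in the session.
        "session_articles": len({pv.article for pv in session}),
    }


# Hypothetical session mirroring the slide: views of 0.5, 1.8, and 2 minutes.
session = [
    PageView("A", 30), PageView("B", 40), PageView("C", 25),
    PageView("A", 108), PageView("D", 60), PageView("E", 15),
    PageView("A", 120),
]
metrics = session_metrics(session, focal="A")
print(metrics)
```

Storing dwell time in whole seconds keeps the sum exact; the conversion to minutes happens once at the end.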
![Page 15: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/15.jpg)
[Figure: an article with several behavior vectors (behavior vector 1, 2, 3)]
Behavior vector
• Average reading behavior on an article described by the three session metrics and the popularity metric
• 9.7K articles; 50K behavior vectors
Reading pattern
• Clustering of the behavior vectors using k-means
• 4 main reading patterns (clusters) were identified
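The clustering step can be illustrated with a small, dependency-free k-means (Lloyd's algorithm). This is a sketch under assumptions, not the study's pipeline: the slides do not say how the vectors were scaled or how the centroids were initialised, so the z-score standardisation, the hand-picked starting centroids (used only to keep the toy example deterministic), and all eight toy vectors are hypothetical. Each vector is ordered as (article views, reading time in minutes, session articles, popularity).

```python
import math


def standardize(vectors):
    """Z-score each dimension so no single metric dominates the distance."""
    dims = list(zip(*vectors))
    means = [sum(d) / len(d) for d in dims]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in d) / len(d)) or 1.0
        for d, m in zip(dims, means)
    ]
    return [[(x - m) / s for x, m, s in zip(v, means, stds)] for v in vectors]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, initial_centroids, iterations=50):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(c) for c in initial_centroids]
    k = len(centroids)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(d) / len(members) for d in zip(*members)]
    return labels, centroids


# Toy behavior vectors (hypothetical values, two per intended pattern).
raw = [
    (1.0, 5.0, 2, 100), (1.1, 5.2, 2, 120),      # long reads, few other articles
    (1.2, 2.0, 4, 5000), (1.1, 2.2, 4, 5200),    # very popular
    (3.0, 2.0, 10, 150), (3.2, 2.1, 11, 130),    # many articles, many returns
    (1.0, 0.5, 10, 110), (1.0, 0.6, 11, 140),    # many articles, short reads
]
z = standardize(raw)
labels, centroids = kmeans(z, initial_centroids=[z[0], z[2], z[4], z[6]])
print(labels)
```

On real data the starting centroids would normally be chosen at random or with a scheme such as k-means++, and the number of clusters would be checked rather than fixed in advance.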
![Page 16: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/16.jpg)
Reading pattern
Focus
• Expected encyclopedic reading behavior
• Users spend a lot of time reading the article (high ReadingTime), but access very few other articles (low value of SessionArticles) within the session
- / + slightly below/above average
-- / ++ far below/above average
![Page 17: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/17.jpg)
Reading pattern
Trending
• Articles related to trending topics (high Popularity)
• Users “quickly look up” information about something that is currently trending or has recently happened (average ReadingTime)
• Highest editing activity: Articles are long (38K), and edited frequently (20 edits)
![Page 18: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/18.jpg)
Reading pattern
Exploration
• Users explore many articles around a topic (high value of SessionArticles)
• They return regularly to the focal article, using it as a kind of ‘navigation page’ (high value of ArticleViews)
![Page 19: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/19.jpg)
Reading pattern
Passing
• Users read many articles related to a topic (high value of SessionArticles)
• Users only pass through the focal article (low ReadingTime) and do not return to it (low ArticleViews)
• Lowest editing activity: Articles are short (16K), and not edited frequently (8 edits)
![Page 20: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/20.jpg)
Reading pattern over time
Stability
• 30% of the articles are popular in a single month
• 10% are popular over the whole 13-month period
• Almost all articles show one reading pattern for half of their lifetime
Transitions
• Transitions are temporary: articles belong to one cluster and move temporarily to another cluster
• High reciprocity: similar number of transitions in both directions
• “Focus” cluster is isolated: articles in that cluster are the most stable ones
• Strong connection between the “Passing”, “Exploration”, and “Trending” clusters: many articles adopt all three reading patterns
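The stability and transition observations can be made concrete with a short sketch that counts cluster changes between consecutive months. It is only an illustration: the slides do not give a formula for reciprocity, so the ratio used below (smaller flow divided by larger flow between two clusters) is an assumed stand-in, and the function names and the three article histories are hypothetical.

```python
from collections import Counter


def transition_counts(monthly_patterns):
    """Count cluster changes between consecutive months, per (from, to) pair."""
    counts = Counter()
    for patterns in monthly_patterns.values():
        for prev, curr in zip(patterns, patterns[1:]):
            if prev != curr:
                counts[(prev, curr)] += 1
    return counts


def reciprocity(counts, a, b):
    """Assumed measure: smaller flow / larger flow between a and b (1.0 = balanced)."""
    forward, backward = counts[(a, b)], counts[(b, a)]
    if max(forward, backward) == 0:
        return 0.0
    return min(forward, backward) / max(forward, backward)


def dominant_share(patterns):
    """Fraction of months an article spends in its most frequent pattern."""
    return Counter(patterns).most_common(1)[0][1] / len(patterns)


# Hypothetical monthly reading patterns for three articles.
history = {
    "Article A": ["Passing", "Trending", "Passing", "Passing"],
    "Article B": ["Focus", "Focus", "Focus", "Focus"],
    "Article C": ["Exploration", "Passing", "Exploration", "Trending"],
}
counts = transition_counts(history)
print(counts)
print(reciprocity(counts, "Passing", "Trending"))
print(dominant_share(history["Article A"]))
```

In this toy history, "Article A" leaves its usual pattern for one month and then returns, which is the kind of temporary transition the slide describes, while "Article B" never leaves the "Focus" cluster.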
![Page 21: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/21.jpg)
Conclusions
Data on readers are available, but their potential has not been fully exploited.
They can help editors make long-lasting decisions in their editorial work, and might engage readers more with Wikipedia.
The temporal nature of reading behavior should be taken into account.
![Page 22: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/22.jpg)
Future work
Extension of the study about reading behavior
Development/Extension of tools that support editors (e.g., SuggestBot)
![Page 23: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/23.jpg)
Thank you.
For more information:
http://janette-lehmann.de/docs/pub2014_ht.pdf
Check out the review by Piotr in the Wikimedia Research Newsletter (vol 4, issue 7, July 2014)
![Page 24: Reading Preference and Behavior on Wikipedia](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c1ad66bb61ebc80a8b47fe/html5/thumbnails/24.jpg)
References
• C. Okoli, M. Mehdi, M. Mesgari, F. A. Nielsen, and A. Lanamäki. The People’s Encyclopedia Under the Gaze of the Sages: A Systematic Review of Scholarly Research on Wikipedia. http://ssrn.com/abstract=2021326, 2012.
• J. Preece, B. Nonnecke, and D. Andrews. The top five reasons for lurking: improving community experiences for everyone. Computers in Human Behavior, 20(2), 2004.
• B. Nonnecke and J. Preece. Lurker demographics: counting the silent. In Proc. CHI, 2000.
• B. Nonnecke, J. Preece, and D. Andrews. What lurkers and posters think of each other. In Proc. HICSS, 2004.
• P. Kollock. The economies of online cooperation: Gifts and public goods in cyberspace. In Communities in Cyberspace, pages 220–239. Routledge, 1990.