Archival Data for Page Protection: Another Missing Dimension of Wikipedia Research

  • Benjamin Mako Hill (Creator)
  • Aaron D. Shaw (Creator)

Dataset

Description

This dataset contains data and software for the following paper: Hill, Benjamin Mako and Shaw, Aaron. (2015) “Page Protection: Another Missing Dimension of Wikipedia Research.” In <em>Proceedings of the 11th International Symposium on Open Collaboration (OpenSym 2015)</em>. ACM Press. doi: 10.1145/2788993.2789846 This is an archival version of the data and software released with the paper. All of these data were (and, at the time of writing, continue to be) hosted at: https://communitydata.cc/wiki-proetection/ Page protection is a feature of MediaWiki software that allows administrators to restrict contributions to particular pages. For example, a page can be “protected” so that only administrators or logged-in editors with a history of good editing can edit, move, or create it. Protection might involve “full protection” where a page can only be edited by administrators (i.e., “sysops”) or “semi-protection” where a page can only be edited by accounts with a history of good edits (i.e., “autoconfirmed” users). Although largely hidden, page protection profoundly shapes activity on the site. For example, page protection is an important tool used to manage access and participation in situations where vandalism or interpersonal conflict can threaten to undermine content quality. While protection affects only a small portion of pages in English Wikipedia, many of the most highly viewed pages are protected. For example, the “Main Page” in English Wikipedia has been protected since February, 2006 and all Featured Articles are protected at the time they appear on the site’s main page. Millions of viewers may never edit Wikipedia because they never see an edit button. Despite it's widespread and influential nature, very little quantitative research on Wikipedia has taken page protection into account systematically. This page contains software and data to help Wikipedia researchers do exactly this in their work. Because a page's protection status can change over time, the snapshots of page protection data stored by Wikimedia and published by Wikimedia Foundation in as dumps is incomplete. As a result, taking protection into account involves looking at several different sources of data. Much more detail can be found in our paper Page Protection: Another Missing Dimension of Wikipedia Research. If you use this software or these data, we would appreciate if you cite the paper.
Date made availableDec 11 2016
PublisherHarvard Dataverse

Cite this