Vol. V, Issue 2, December 15, 2006
Vandals, Administrators, and Sockpuppets, Oh My! An Ethnographic Study of Wikipedia’s Handling of Problem Behavior

Michael Lorenzen

Wikipedia is a 21st century phenomenon that is forcing many to reconsider what is and what is not valid and authoritative online. Wikipedia is an online encyclopedia that anyone can edit. This creates many opportunities to expand knowledge, but it also opens the project up to vandalism and abuse. Many writers have commented on this and determined that Wikipedia has a good defense against problematic behavior, even if these same writers are unsure of the legitimacy of Wikipedia as a whole. Other writers have noted the need for identified authors for legitimacy to be attainable. This ethnographic study looks at a public system that Wikipedia uses to identify and correct problem behaviors from contributors. It concludes that Wikipedia does have a good system in place that can protect the integrity of articles in many instances. However, this study was limited in scope and was unable to determine whether the system in place for abuse reporting is truly able to vouch for the status of Wikipedia as an authoritative resource.

Introduction

The way that people access information has changed dramatically over the last 20 years. The advent of the World Wide Web has reduced reliance on printed material at the same time that it has empowered more voices to be heard. It has also raised issues relating to the validity of information. As much of the content on the Web goes through no editorial process, it can be difficult to assess whether information is accurate. Wikipedia is an online encyclopedia that has attempted to address this concern. It allows anyone to edit, but at the same time it has a variety of layers of review which make it difficult (but not impossible) to vandalize with spam or inaccurate information.
This paper will relate an ethnographic study that explored the culture of Wikipedia with respect to how Wikipedia deals with some problem behaviors in the user community and how this might impact the understanding of Wikipedia as an authoritative source.

Background

Wikipedia is a large online community engaged in maintaining and expanding a huge (over 760,000 articles) encyclopedia that anyone can edit. It was founded in 2001 and bills itself as an encyclopedia even though it currently has no print version. Wikipedia allows other websites to copy its material and use it however they like. As a result, there are hundreds of different versions of Wikipedia all over the Web, and any edit made to a Wikipedia article has the potential to be disseminated widely in a short time. Quality control is therefore important for Wikipedia, because if bad information becomes associated with the site it could severely damage the reputation of the entire project.

Wikipedia is highly visible and accessible for anyone to edit. It attracts users who are interested in vandalizing the encyclopedia. This includes users who want to blank pages, add inappropriate words to articles in places they do not belong, create vanity articles about themselves, create bogus articles on non-existent people, insert erroneous information into existing articles, add spam links to other sites on the Web, and so forth. Because Wikipedia is so large and has thousands of individuals working on it at any given moment, the sheer complexity of detecting problematic users can be mind-boggling.

Wikipedia is operated entirely by volunteers and has found ways to defeat abusive behavior. Many in the Wikipedia community dedicate significant amounts of time to monitoring edits that are made on pages. Anyone can see changes made anywhere in Wikipedia by looking at the Recent Changes page. There are thousands of edits made an hour and this page changes frequently.
Users can also view the edit history of any page and revert any edits which are not beneficial to the project. Frequently targeted articles, such as the George W. Bush page, are monitored for every edit made. Anyone who uses Wikipedia can view the edit history of every registered user or anonymous IP address to track and edit any changes.

Literature Review and Theoretical Underpinnings

Wikipedia is only a few years old and does not have many references in the scholarly literature. Despite this, it has already generated several scholarly articles. It has also generated a large number of online articles which are not indexed in the scholarly literature. Most of these rehash the same points with varying degrees of quality. They cover the authenticity of Wikipedia, how projects like Wikipedia will redefine knowledge, the pros and cons of Wikipedia, and strategies for contributing to Wikipedia. The Open Directory Project alone listed 33 quality online articles on Wikipedia in November 2005.

One of the first scholarly looks at Wikipedia was by Ciffolilli (2003), who noted that Wikipedia had to deal with self-selection of editors and that this could cause problems. Ciffolilli also expressed concerns about the authenticity of information at the project and the lack of authorship being attributed on articles. However, on the whole, Ciffolilli thought that Wikipedia had a strong start and was successful.

Ciffolilli (2003) found that Wikipedia was efficient at dealing with vandalism, writing, “I expected Wikipedia to be engaged in an endless war among reliable contributions and graffiti attacks that would have blocked the development of the Web site. In reality, that has not happened, basically because all changes made to any article are stored; it is possible to undo any unapproved modification with a single click. 
This makes the activity of littering a page extremely more expensive for an individual (in terms of time and reputation), than it is for anyone else.”

O’Leary (2005) reviewed Wikipedia as a database. He concluded (with some reservations) that it was, indeed, an encyclopedia. On combating vandalism, he noted that, “Wikipedians are quick to self-police. Administrators, a group of several hundred Wikipedians, exercise a degree of authority when trouble occurs; egregious malefactors can even be exiled” (p. 53). As such, O’Leary’s views coincide with Ciffolilli’s (2003) on the ability of the Wikipedia community to find abuse and protect the integrity of the articles.

Others have noted that although Wikipedia does self-police, in the end some bad information gets through. Hence, it can never be considered as reliable as a physical print encyclopedia. Achterman (2005) related his experience with Wikipedia at a high school and found that students did not bother to evaluate information. Wikipedia was easy to access and as such was a preferred method of acquiring facts. Even when students were shown Wikipedia’s flaws, they still preferred to use the source. Achterman concluded that this reaction by students to Wikipedia calls for more emphasis on information literacy instruction at all levels of education.

One concern that has been expressed about Wikipedia is the anonymous nature of the work. All edits are logged, but authorship is not attributed on any article. As examining the credentials of an author is one method of judging the validity of information, this removes an important means of assessing an article at Wikipedia. It also raises concerns for some authors. Miller (2005) wrote that she believed that Wikipedia deliberately blurred the lines between writing and reading and reduced the importance of the role of authorship.

Miller (2005), acknowledging this potential problem, wrote about how Wikipedia protected the validity of the articles.
She wrote, “Every page has a history of changes…However, the wiki doesn’t record this information to assert authorship per se. Rather, other readers use it to determine whether a page has changed since they last viewed it or to discover the identity of a writer who perhaps introduced an error or a spurious comment” (p. 40).

Miller’s (2005) concern for authorship attribution at Wikipedia is an important observation. It touches on a theoretical approach used in this study. One way to look at Wikipedia is through the lens of postmodernism. Wikipedia is a democratic project allowing anyone, regardless of age, race, sex, nationality, income level, etc., to edit. Postmodernism, among other things, holds that knowledge must accommodate the multiple perspectives of class, gender, race, etc. Wikipedia allows all to contribute to the knowledge base. As such, Wikipedia is a very postmodern project. However, writing at Wikipedia is an anonymous act. Does the very postmodern nature of Wikipedia, as evidenced by its openness and lack of authorship, undermine the validity and authority of the project?

There are some indications that this might be the case. In an article on evaluating websites in Phi Delta Kappa, Chamberlain (2002) wrote that authority is one of five main criteria for evaluating a website. She noted that, “Some pages do not list any individual’s name as the author of the textual material. In these cases, it is the sponsors…that assume responsibility for the content” (p. 14). By this standard, Wikipedia must be considered the responsible party for all content authored at the site. However, Wikipedia does not warrant any article, and the Wikipedia community actively seeks to change content endlessly and anonymously. Chamberlain (2002) listed several red flags which indicate questionable authority, including anonymous page authors, no credentials listed for authors, and no provision for contacting authors. These three apply to Wikipedia to varying degrees.
Wallace (2004) also wrote about the importance of authorship in determining the authority of curricular texts. He observed that textbooks are considered good tools in classrooms because of their authorship, noting, “Teachers can question the accuracy and authority of a textbook, but they need not do so. On the Internet, however, little is authorized for classroom use, making the issue of authority ever present” (p. 477). Again, this lack of clear authorship would appear to hinder the ability of Wikipedia to be seen as an authoritative source.

Methodology

This study was conducted to examine the formal abuse detection method of Wikipedia. In particular, it asks whether the Wikipedia community is successful in combating problematic behavior from some users. A single and very active abuse detection page monitored by the Wikipedians was selected for observation. This is not the entirety of the methods used by Wikipedia to fight problem behavior, but it is a good representation of community efforts. In addition, the study examined how often reported vandal behavior was actually vandalism, how often abusive users got banned, and whether the community was able to identify sockpuppet accounts being used by contributors.

It is important to note that this study does not seek to identify how often vandals and other abusers may be successful at Wikipedia. Invariably, some problematic behavior avoids detection or is dealt with by Wikipedia users individually. It would be nearly impossible to track down these instances for the dates studied. As such, this study looks at the formalized abuse detection system in place at Wikipedia and the successes and failures it may have.

This study examines Wikipedia and how it protects the validity of articles by combating abusive behavior such as vandalism. This is important when considering the articles cited in the literature review.
Early studies of Wikipedia have noted that it is possible to vandalize Wikipedia, but they have also noted that the project has a good abuse detection system. Other writers have noted, though, that the anonymity of web resources is a warning flag which indicates a lack of authority. What does this mean for Wikipedia? Can analyzing the system of monitoring vandalism and other abuses help make a case for the credibility and authority of Wikipedia?

The researcher believes ethnography was the most appropriate method to conduct this research. As edits at Wikipedia are logged and archived, it is possible to observe community behavior unobtrusively. The ability to print out edit logs allows for an easy and accurate method of recording all interactions that are of interest. Researchers can literally read verbatim anything of interest which has occurred at Wikipedia.

The same archived edits, coupled with registered user names, would allow for a case study of abusive users and those who monitor them. However, abusive users of Wikipedia have a tendency to use multiple accounts and anonymous IP addresses, which may change with every visit they make. This would make it difficult to determine whether the case being studied was actually the same person. In addition, combating problem behavior is only one aspect of the work that regular Wikipedia editors conduct every day. Examining those editors’ logs may not be productive if the majority of the users’ actions are unrelated to problem behavior detection and correction.

The Wikipedia community tends to be anonymous, so use of a biographical method would be possible but difficult. Registered users create names to edit under, but these mostly tend to be pseudonyms. Many editors have their own user pages which give information about them, but few give out their real names or contact information. In the case of prominent users who often fight abusive behavior, this is especially true.
Vandals and spammers have been known to harass Wikipedians who revert their work. This harassment has been known to cross over beyond Wikipedia and into the real life of the Wikipedian. A biographical study would require contact with an actual Wikipedia user involved with abuse detection and correction, and this limitation would make it difficult to arrange and conduct such a study in the timeframe allowed for this research. Phenomenological and grounded theory methods would compound the problem, as they would require actual contact (interviews) between the researcher and a large number of Wikipedians.

There are several pages at Wikipedia which are used by the community to monitor abusive behavior. These pages also allow for coordination of the response to these abusive actions and allow the volunteer Wikipedia Administrators to issue warnings and ban problematic users. One of these is the Vandalism in Progress (http://en.wikipedia.org/wiki/Wikipedia:Vandalism_in_progress) page. It is used by the Wikipedians to notify the entire community when they believe they have discovered a vandal at work.

For several months, the researcher visited the Vandalism in Progress page and noted the activity there. The researcher made no contributions to the ongoing notices or conversations. As no edits were made, the researcher was completely invisible to the community, and the presence of the researcher could in no way have influenced any of the activities occurring.

One useful feature of Wikipedia is that all edits are logged and essentially stored forever. This makes it easy for researchers to investigate any aspect of the community and then find it again later. There is no need to record or copy anything. The researcher selected two date ranges to print out from the Vandalism in Progress page: October 4th through October 7th, 2005, and November 7th through November 28th, 2005.
The researcher also made color-coded notes on vandal reports based on the observations that were being made.

Analysis

The observation yielded several pieces of information about Wikipedia. The first is that not all reported vandal behavior is vandalism. There are detailed instructions on the page about which procedures to follow before reporting a vandalism incident. Despite this, many users skip steps and make reports on users who are not yet considered vandals. Users are supposed to be warned about their behavior first. Also, many new users make mistakes which might appear to be vandalism. In these cases, the advice is, “Please do not bite the newcomers.” Other users report disagreements over the content of an article as vandalism when the rules note this is not the case.

Several examples of false reporting were noted. In one case, an anonymous user complained that a user had vandalized a page relating to a company the user was associated with. Katefan0 responded by noting that a single bad edit is not considered vandalism in progress. In another case, Revolucion accused Rafterman of constantly vandalizing articles. Rafterman responded that he had not made a single vandal edit in the time since he was unblocked. Sherurcij agreed with Rafterman, and that seemed to resolve the complaint on the Vandalism in Progress page.

Over the two timeframes studied, there were a total of 16 false reports, which other users often politely pointed out were not considered vandalism in progress. Considering the thousands of users who edit at Wikipedia every day, this seems to be a rather low number of false reports. It would seem that the clear instructions for when to report vandalism on the Vandalism in Progress page were followed by the majority of users.

Another piece of information learned is that the Vandalism in Progress page does indeed lead to abusive users being banned. There is a lot of talk about giving warnings to users first.
A warning is in fact required before a posting is made to the vandalism page. Users who continue editing in an abusive pattern can be banned for a period of time.

One example includes a discussion among three users (Adidas, Katefan0, and Sasquatch) about whether to ban an anonymous user’s IP address. Adidas wanted a ban, but the Administrator Katefan0 thought that might prove disruptive. Katefan0 wrote that a block is warranted only if, “it rises to a level that editors can’t deal with, i.e., that it is disruptive.” Sasquatch (another Administrator) was less forgiving. Three hours later he banned the IP address with the note, “I am less lenient and so I blocked him after he started up again today, he has nothing to add.”

Other blocks were requested for users vandalizing a variety of pages. These included vandals hitting articles such as Joan of Arc, Puerto Rico, sub-atomic particles, sacrament, Operation Ajax, and Hawaii. Other users were banned for vandalizing the talk pages and generally harassing other Wikipedia editors.

A total of 39 user bans were handed out during the two timeframes covered by the study. According to the ban log, all of these bans lasted between 4 and 48 hours. Permanent bans can be handed out but are rare. It is worth noting that not all banned vandals originated from notices on the Vandalism in Progress page. Most vandals were banned by administrators who noted persistent vandalism without reporting it to the Vandalism in Progress page. In fact, over 100 bans were handed out on the 28th of November alone, according to the Block Log. As such, user reports about vandalism on the Vandalism in Progress page account for only a small portion of eventual user bans.

Another finding was that sockpuppets are common at Wikipedia. A sockpuppet (http://en.wikipedia.org/wiki/Wikipedia:Sock_puppet) is a user name created by an existing user. With a sockpuppet, the same person posts under multiple identities.
This can be done for several reasons, including to hide vandalism, to offer an opinion multiple times in a dispute while appearing to be multiple users, and to get extra votes in a Wikipedia election. The Vandalism in Progress page has a subsection for reporting suspected sockpuppets who are vandalizing.

One example includes a user who created at least 23 sockpuppet accounts with the intent to vandalize the Michael Jackson article. Bison Augustus was accused of having over 30 sockpuppet accounts based on the fact that all the accounts originated from the same New Zealand IP block, all were using racist language, and all were participating in the edit war on the article dealing with Japanese War Crimes. During the two timeframes noted, there were 30 sockpuppet reports to the vandalism page.

What does all of this data mean? One conclusion is that the Vandalism in Progress page is one method that the Wikipedia community uses to detect and eliminate problematic edits, and in this sense the page is successful. The two timeframes studied had hundreds of reports of vandalism, and these resulted in other users springing into action to fix the damage and led to many users being banned. Further, this page was able to identify a high number of likely sockpuppet accounts, letting users know when another user was using multiple accounts and log-ins to possibly abuse Wikipedia.

As the Vandalism in Progress page is just one small part of the larger Wikipedia abuse and vandalism detection system, it is difficult to say absolutely that Wikipedia is good at protecting the integrity of articles from vandals. However, it is a good indicator that a significant amount of thought and effort is being devoted to this matter by the Wikipedians. Clearly, vandals who do not understand how Wikipedia works are liable to be discovered and identified on the Vandalism in Progress page in very short order.
One concern about the validity of the articles at Wikipedia is the ability of clever vandals to make changes to articles without getting caught. Once a user learns how most vandalism is detected, actions can be taken to counter this. For example, although edit logs are kept on every user, there is nothing to stop a user from making a new account every time he or she logs in. Logging in with AOL or EarthLink will give the user a new IP address every time for anonymous editing. If vandals make only one or two edits and then abandon the account, future vandalism by subsequent accounts will not give prior vandalism away. Also, if the vandal makes small changes to out-of-the-way articles, or adds a seemingly related spam link to an article while also making a legitimate edit to improve the article, the vandalism may survive for a long period of time.

Long-established and trusted members do not have their edits scrutinized on a regular basis. It would be fairly easy for these users to make infrequent and hard-to-detect changes to vandalize articles or insert their own biases into them.

The authors who have written about Wikipedia, including Ciffolilli (2003), O’Leary (2005), and Miller (2005), all had concerns about the validity of Wikipedia but were reassured that many of the methods used by the Wikipedia community were effective in eliminating much of the bad information and vandalism. The results of this study indicate that these authors are correct in their assessment and that there is a strong and viable reaction to problematic behavior at Wikipedia. The community is quite capable of defending itself from obvious vandals.

This study did not examine other methods used by Wikipedia, such as users individually detecting and fixing vandalism without reporting it or the banning of spam domains so that they cannot be added. Nor did it examine how frequently vandals’ edits succeed or how long they survive. As a result, this study cannot answer the underlying question of whether Wikipedia can truly be considered authoritative.
In addition, questions about the legitimacy of any anonymously written resource remain valid. The changes in technology that have produced the postmodern-like existence of Wikipedia do not go away just because a method of abuse detection exists, even if it were 100% effective.

References

Achterman, D. (2005). Surviving Wikipedia: Improving student search habits through information literacy and teacher collaboration. Knowledge Quest, 33(5), 38-40.

Chamberlain, E. (2002). Evaluating website content. Phi Delta Kappa, no. 492, 7-43.

Ciffolilli, A. (2003). Phantom authority, self-selective recruitment and retention of members in virtual communities: The case of Wikipedia. Retrieved 12 October 2005, from http://www.firstmonday.dk/issues/issue8_12/ciffolilli/index.html.

Miller, N. (2005). Wikipedia and the disappearing “author”. Etc., 62(1), 37-40.

O’Leary, M. (2005). Wikipedia: Encyclopedia or not? Information Today, 22(8), 49, 53.

Wallace, R. M. (2004). A framework for understanding teaching with the internet. American Educational Research Journal, 41(2), 447-488.