Inproceedings3283
Whose article is it anyway? - Detecting authorship distribution in Wikipedia articles over time with WIKIGINI
Published: July 2012
Book title: Proceedings of the Wikipedia Academy 2012
Publisher: Online publication
Place of publication: Berlin
Organization: Wikipedia Academy 2012, Wikimedia Deutschland
Refereed publication
Abstract
Presentation on video:
http://vimeo.com/wikimediade/paper-session3#t=3309
In this work, we present a novel approach to detecting the authorship of words in Wikipedia, which significantly outperforms the baseline method in terms of accuracy. This is achieved by reducing the number of word-based text-to-text comparisons required, which are the most error-prone steps in the process. We further argue that the concentration of words among just a few authors can be an indicator of a lack of quality and/or neutrality in an article. To provide an aggregated measure of this concentration, we calculate a Gini coefficient for each revision of an article based on our word-author assignments. The development of the coefficient over time within an article is visualized and provided online as an easily accessible and useful tool for investigating how the content of an article evolved. We present examples where the Gini curve gives useful insights into differences between articles and may help to spot crucial events in the past evolution of an article.
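To make the concentration measure concrete, the following is a minimal sketch of how a Gini coefficient could be computed from word-author assignments for a single revision. It uses the standard discrete Gini formula over per-author word counts; the abstract does not state the exact variant used in the paper, so the formula, the function names `gini` and `revision_gini`, and the input format (one author label per word) are all assumptions made for illustration.

```python
from collections import Counter


def gini(values):
    """Standard discrete Gini coefficient of non-negative values.

    Returns 0.0 for perfect equality; approaches 1.0 as the total
    concentrates in a single entry. (Assumed formula, not confirmed
    to be the variant used in the paper.)
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = sum_i (2i - n - 1) * x_i / (n * sum_i x_i), with x sorted ascending
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def revision_gini(word_authors):
    """Gini coefficient of words per author for one revision.

    `word_authors` is a hypothetical input: a sequence with one author
    label per word in the revision text.
    """
    counts = Counter(word_authors)
    return gini(counts.values())


# Hypothetical revision: three words attributed to "alice", one to "bob".
example = revision_gini(["alice", "alice", "alice", "bob"])
```

Applying `revision_gini` to every revision in order would give the curve over time described above. Note that this sketch only counts authors who own at least one word in the revision; whether authors with zero surviving words should also enter the calculation is a design choice the abstract leaves open.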
Download: Media:Paper.pdf
Further information: Link