A Multidimensional Dataset Based on Crowdsourcing for Analyzing and Detecting News Bias

Michael Färber, Victoria Burkard, Adam Jatowt, Lim Sora

Published: 2020

Book title: Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM'20)
Publisher: ACM
Place of publication: Galway, Ireland

Refereed publication


Abstract
The automatic detection of bias in news articles can have a high impact on society because undiscovered news bias may influence the political opinions, social views, and emotional feelings of readers. While various analyses and approaches to news bias detection have been proposed, large datasets with rich bias annotations on a fine-grained level are still missing. In this paper, we first aggregate the aspects of news bias in related works by proposing a new annotation schema for labeling news bias. This schema covers the overall bias, as well as the bias dimensions (1) hidden assumptions, (2) subjectivity, and (3) representation tendencies. Second, we propose a methodology based on crowdsourcing for obtaining a large dataset for news bias analysis and identification. We then use our methodology to create a dataset consisting of more than 2,000 sentences annotated with 43,000 bias and bias dimension labels. Third, we perform an in-depth analysis of the collected data. We show that the annotation task is difficult with respect to bias and specific bias dimensions. While crowdworkers' labels of representation tendencies correlate with experts' bias labels for articles, subjectivity and hidden assumptions do not correlate with experts' bias labels and, thus, seem to be less relevant when creating datasets with crowdworkers. The experts' article labels better match the inferred crowdworkers' article labels than the crowdworkers' sentence labels. The crowdworkers' countries of origin seem to affect their judgements. In our study, non-Western crowdworkers tend to annotate more bias either directly or in the form of bias dimensions (e.g., subjectivity) than Western crowdworkers do.
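The abstract mentions article labels inferred from crowdworkers' sentence labels. As a rough illustration only, the sketch below shows one simple way such an inference could be done: a majority vote per sentence, followed by a majority vote over the sentences of each article. The majority-vote rule, the tuple layout, the label names "biased" and "neutral", and the function names are all assumptions made for this example; the page does not state how the paper actually aggregates labels.

```python
from collections import Counter, defaultdict


def majority_label(labels):
    """Return the most frequent label; ties go to the alphabetically first label."""
    counts = Counter(labels)
    top = max(counts.values())
    return sorted(label for label, n in counts.items() if n == top)[0]


def aggregate(annotations):
    """Aggregate crowd labels (hypothetical scheme, not the paper's).

    annotations: iterable of (article_id, sentence_id, worker_id, label).
    Returns (sentence_labels, article_labels) as dictionaries.
    """
    # Step 1: collect all worker labels per sentence and take the majority.
    by_sentence = defaultdict(list)
    for article_id, sentence_id, _worker_id, label in annotations:
        by_sentence[(article_id, sentence_id)].append(label)
    sentence_labels = {key: majority_label(v) for key, v in by_sentence.items()}

    # Step 2: infer an article label as the majority of its sentence labels.
    by_article = defaultdict(list)
    for (article_id, _sentence_id), label in sentence_labels.items():
        by_article[article_id].append(label)
    article_labels = {a: majority_label(v) for a, v in by_article.items()}
    return sentence_labels, article_labels


# Made-up toy data: two articles, three workers per sentence.
annotations = [
    ("a1", 0, "w1", "biased"), ("a1", 0, "w2", "biased"), ("a1", 0, "w3", "neutral"),
    ("a1", 1, "w1", "neutral"), ("a1", 1, "w2", "biased"), ("a1", 1, "w3", "biased"),
    ("a1", 2, "w1", "neutral"), ("a1", 2, "w2", "neutral"), ("a1", 2, "w3", "neutral"),
    ("a2", 0, "w1", "neutral"), ("a2", 0, "w2", "neutral"), ("a2", 0, "w3", "neutral"),
]
sentence_labels, article_labels = aggregate(annotations)
print(article_labels)
```

In this toy data, two of the three sentences in article "a1" receive a "biased" majority, so the inferred article label is "biased", while "a2" is inferred as "neutral". Inferred article labels of this kind could then be compared against expert article labels, which is the sort of comparison the abstract reports.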

ISBN: 978-1-4503-6859-9/20/10
Download: Media:NewsBias-CIKM2020.pdf
DOI Link: 10.1145/3340531.3412876

Project

Digilog@bw

Linked datasets

NewsBias2020

Research group

Web Science

Research area

Information Retrieval, Text Mining, Natural Language Processing, Data Mining, Artificial Intelligence, Data Science
