Current version as of 3 September 2020, 21:08




NewsBias2020

Media bias dataset consisting of more than 2,000 sentences annotated with 43,000 bias and bias dimension labels.


Contact person: Michael Färber

https://doi.org/10.5281/zenodo.3885351

Research group: Web Science




Description

The automatic detection of bias in news articles can have a high impact on society, because undiscovered news bias may influence the political opinions, social views, and emotional feelings of readers. While various analyses and approaches to news bias detection have been proposed, large datasets with rich, fine-grained bias annotations are still missing. First, we aggregate the aspects of news bias discussed in related work by proposing a new annotation schema for labeling news bias. This schema covers the overall bias as well as the bias dimensions (1) hidden assumptions, (2) subjectivity, and (3) representation tendencies. Second, we propose a crowdsourcing-based methodology for obtaining a large dataset for news bias analysis and identification. We then use this methodology to create a dataset consisting of more than 2,000 sentences annotated with 43,000 bias and bias dimension labels. Third, we perform an in-depth analysis of the collected data. We show that the annotation task is difficult with respect to bias and specific bias dimensions. While crowdworkers' labels of representation tendencies correlate with experts' bias labels for articles, subjectivity and hidden assumptions do not, and thus seem less relevant when creating datasets with crowdworkers. The experts' article labels match the inferred crowdworkers' article labels better than the crowdworkers' sentence labels. The crowdworkers' countries of origin seem to affect their judgements: in our study, non-Western crowdworkers tend to annotate more bias, either directly or in the form of bias dimensions (e.g., subjectivity), than Western crowdworkers do.
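The description sketches two analysis steps that can be made concrete: aggregating per-sentence crowd labels (e.g., by majority vote) and correlating the inferred article-level labels with expert labels. The following minimal Python sketch illustrates that pattern under assumed file and column names (newsbias2020_crowd_labels.csv, worker_label, expert_label, etc.); these names are hypothetical and do not reflect the actual layout of the Zenodo files.

# Hedged sketch: aggregate per-sentence crowd judgements by majority vote
# and correlate the inferred article-level bias scores with expert labels.
# All file and column names below are assumptions, not the dataset's schema.
import pandas as pd
from scipy.stats import spearmanr

# Assumed layout: one row per (sentence, crowdworker) judgement,
# with a binary bias label (0 = unbiased, 1 = biased).
crowd = pd.read_csv("newsbias2020_crowd_labels.csv")      # hypothetical file
experts = pd.read_csv("newsbias2020_expert_labels.csv")   # hypothetical file

# 1) Majority vote per sentence.
sentence_labels = (
    crowd.groupby(["article_id", "sentence_id"])["worker_label"]
         .mean()                     # fraction of workers voting "biased"
         .ge(0.5)                    # majority vote
         .astype(int)
         .reset_index(name="crowd_label")
)

# 2) Infer an article-level score as the share of sentences judged biased.
article_scores = (
    sentence_labels.groupby("article_id")["crowd_label"]
                   .mean()
                   .reset_index(name="crowd_bias_score")
)

# 3) Rank correlation between inferred crowd scores and expert article labels.
merged = article_scores.merge(experts, on="article_id")
rho, p = spearmanr(merged["crowd_bias_score"], merged["expert_label"])
print(f"Spearman rho = {rho:.3f} (p = {p:.3g})")

Majority voting is only one possible aggregation choice; the paper may use a different inference scheme, and the same pattern extends to the three bias dimension labels.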


Involved persons
Michael Färber


Publications

Michael Färber, Victoria Burkard, Adam Jatowt, Sora Lim
A Multidimensional Dataset Based on Crowdsourcing for Analyzing and Detecting News Bias
Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM'20), ACM, Galway, Ireland




Projects
digilog@bw