Measuring the Reliability of Hate Speech Annotations: the Case of the European Refugee Crisis

Roß, Björn; Rist, Michael; Carbonell, Guillermo; Cabrera, Benjamin; Kurowsky, Nils; Wojatzki, Michael

doi:10.17185/duepublico/42132

Conference paper, September 2016, CC BY-ND 4.0

Published

Some users of social media are spreading racist, sexist, and otherwise hateful content. For the purpose of training a hate speech detection system, the reliability of the annotations is crucial, but there is no universally agreed-upon definition. We collected potentially hateful messages and asked two groups of internet users to determine whether they were hate speech or not, whether they should be banned or not and to rate their degree of offensiveness. One of the groups was shown a definition prior to completing the survey. We aimed to assess whether hate speech can be annotated reliably, and the extent to which existing definitions are in accordance with subjective ratings. Our results indicate that showing users a definition caused them to partially align their own opinion with the definition but did not improve reliability, which was very low overall. We conclude that the presence of hate speech should perhaps not be considered a binary yes-or-no decision, and raters need more detailed instructions for the annotation.

Classification

Conference:

NLP4CMC III: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication, 22 September 2016

Publication date:

09.2016

URN:

urn:nbn:de:hbz:464-20161109-072455-8

DOI:

10.17185/duepublico/42132

Language:

English

Resource type:

Text

Keywords:

hate speech; hate speech detection; twitter; social media; social media analytics; natural language processing

Collection:

E-Publications

Dewey Decimal Classification:

000 Computer science, information science, general works

Subject groups of the Deutsche Nationalbibliographie:

004 Computer science

Link URL:

https://sites.google.com/site/nlp4cmc2016/

Link URL:

https://www.linguistics.ruhr-uni-bochum.de/bla/

Institution:

Fakultät für Ingenieurwissenschaften, Informatik und Angewandte Kognitionswissenschaft

Information on first publication:

Roß, B., Rist, M., Carbonell, G., Cabrera, B., Kurowsky, N., Wojatzki, M., 2016. Measuring the Reliability of Hate Speech Annotations: the Case of the European Refugee Crisis.

Published in: Michael Beißwenger, Michael Wojatzki and Torsten Zesch (Eds.): NLP4CMC III: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication, 22 September 2016. (Bochumer Linguistische Arbeitsberichte; 17).
https://www.linguistics.ruhr-uni-bochum.de/forschung/arbeitsberichte/17.pdf, pp. 6-9

Published: September 2016
