Proceedings of the GermEval 2017 – Shared Task on Aspect-based Sentiment in Social Media Customer Feedback

In the connected, modern world, customer feedback is a valuable source of insights into the quality of products and services. It lets customers benefit from the experiences of others and enables businesses to react to requests, complaints, or recommendations. However, the more people use a product or service, the more feedback is generated, which makes analyzing huge amounts of feedback efficiently yet meaningfully a major challenge.

Aspect-based Sentiment Analysis is an important task for analyzing customer feedback, and a growing number of shared tasks exist for various languages. However, these tasks lack large-scale German datasets. Thus, we present a shared task on automatically analyzing customer reviews about “Deutsche Bahn” – the German public train operator with about two billion passengers each year. We have annotated more than 26,000 documents and present four sub-tasks that represent a complete classification pipeline (relevance, sentiment, aspect classification, opinion target extraction).

The results indicate that the public transport domain offers challenging tasks. For example, the large number of aspects, combined with the almost Zipfian label distribution of real user feedback, leads to label bias problems and creates strong baselines. We observe that extensive preprocessing, large sentiment lexicons, and the combination of neural and more traditional classifiers are advantageous strategies for the formulated tasks. The Shared Task is a first step in sentiment analysis for this domain.
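
To illustrate why a Zipfian label distribution creates strong baselines, the following minimal sketch compares a majority-class baseline on skewed and on uniform labels. It is not taken from the task data: the 20 labels, the 10,000 documents, the 1/rank weighting, and the helper names `zipfian_counts` and `majority_baseline_accuracy` are all assumptions made for illustration.

```python
from collections import Counter


def zipfian_counts(num_labels, total):
    """Hypothetical label counts proportional to 1/rank (Zipf, exponent 1)."""
    weights = [1.0 / rank for rank in range(1, num_labels + 1)]
    norm = sum(weights)
    return [round(total * w / norm) for w in weights]


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(labels)


# Synthetic data only: 20 made-up aspect labels, roughly 10,000 documents.
counts = zipfian_counts(num_labels=20, total=10_000)
zipf_labels = [f"aspect_{i}" for i, c in enumerate(counts) for _ in range(c)]
uniform_labels = [f"aspect_{i % 20}" for i in range(10_000)]

zipf_acc = majority_baseline_accuracy(zipf_labels)
uniform_acc = majority_baseline_accuracy(uniform_labels)
print(f"Zipfian: {zipf_acc:.2f}, uniform: {uniform_acc:.2f}")
```

Under these assumptions, always predicting the top label scores several times higher on the skewed labels than on the uniform ones, which is one way a trivial baseline can become hard to beat when a few aspects dominate.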

For the GermEval 2017 Shared Task, we received 8 submissions. One submission was withdrawn from the proceedings. The dataset and the proceedings are available from the task website.

Rights

Use and reproduction:
This work may be used under a Creative Commons Attribution-NonCommercial 4.0 license (CC BY-NC 4.0).