Cross-Lingual Content Scoring

Horbach, Andrea; Stennmanns, Sebastian; Zesch, Torsten

doi:10.18653/v1/W18-0550

Conference paper, 2018, CC BY 4.0

Published

We investigate the feasibility of cross-lingual content scoring, a scenario where training and test data in an automatic scoring task are from two different languages. Cross-lingual scoring can contribute to educational equality by allowing answers in multiple languages. Training a model in one language and applying it to another language might also help to overcome data sparsity issues by reusing trained models from other languages. As there is no suitable dataset available for this new task, we create a comparable bi-lingual corpus by extending the English ASAP dataset with German answers. Our experiments with cross-lingual scoring based on machine-translating either training or test data show a considerable drop in scoring quality.
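
The "translate the test data" variant mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' system: the word-by-word lexicon is a hypothetical stand-in for a real machine translation step, the bag-of-words overlap scorer is a deliberately simple placeholder for a trained scoring model, and the example answers are invented rather than taken from the ASAP data.

```python
from collections import Counter

# Hypothetical stand-in for machine translation: a tiny German-to-English
# word lexicon. A real setup would call an actual MT system instead.
TOY_LEXICON = {
    "pflanzen": "plants", "brauchen": "need", "licht": "light",
    "und": "and", "wasser": "water", "ich": "i", "weiss": "know",
    "es": "it", "nicht": "not",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def translate_de_to_en(text):
    """Word-by-word lookup; unknown words pass through unchanged."""
    return " ".join(TOY_LEXICON.get(tok, tok) for tok in tokenize(text))


def train(answers, scores):
    """Build one bag-of-words profile per score label."""
    profiles = {}
    for answer, score in zip(answers, scores):
        profiles.setdefault(score, Counter()).update(tokenize(answer))
    return profiles


def predict(profiles, answer):
    """Return the score whose profile shares the most tokens with the answer."""
    tokens = Counter(tokenize(answer))

    def overlap(score):
        # Multiset intersection counts the tokens shared with the profile.
        return sum((tokens & profiles[score]).values())

    # Ties are resolved in favour of the lowest score label.
    return max(sorted(profiles), key=overlap)


# Invented English training answers with binary scores.
train_answers = [
    "plants need light and water",
    "light and water help plants grow",
    "i do not know",
    "it is not clear",
]
train_scores = [1, 1, 0, 0]
model = train(train_answers, train_scores)

# German test answers are translated into English before scoring.
for german_answer in ["Pflanzen brauchen Licht und Wasser", "Ich weiss es nicht"]:
    english_answer = translate_de_to_en(german_answer)
    print(german_answer, "->", english_answer, "->", predict(model, english_answer))
```

The opposite direction described in the abstract would instead translate the training answers into the test language and train the model on the translated text, leaving the test answers untouched.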

Classification

Conference: NAACL HLT 2018, Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, New Orleans, Louisiana, June 5, 2018
Publication date: 2018
URN: urn:nbn:de:hbz:464-20211019-110432-4
DOI: 10.18653/v1/W18-0550
Language: English
Resource type: Text
Collection: E-publications
Subject groups of the Deutsche Nationalbibliographie: 004 Computer science
Institution: Faculty of Engineering, Computer Science and Applied Cognitive Science, Computer Science, Language Technology
Original publication: Horbach, A., Stennmanns, S., Zesch, T. (2018). Cross-Lingual Content Scoring. In: Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 410-419. Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/W18-0550