How do we speak with ALEXA: Subjective and objective assessments of changes in speaking style between HC and HH conversations

Nowadays, a diverse set of technical solutions is used to detect whether a system should react to an uttered speech command. Unfortunately, the preferred method, wake words, can lead to confusion, e.g. when the word has been said but the user did not intend to interact with the system. Technical systems should therefore be able to detect by themselves whether they are being addressed. To achieve this goal, research concentrates on analyzing speech. Analyzing speakers' self-assessment of their speech characteristics while addressing a system can provide further information, which has not been considered in the field up to now. Using a newly generated voice assistant conversation corpus, this paper presents insights into the participants' addressing behavior and correlates objective and subjective changes in speaking style characteristics between human-human (HH) and human-computer (HC) conversations. The results show that users recognize changes in some of their speech characteristics. Furthermore, the objectively identifiable changes depend heavily on the type of interaction. Intonation and stress patterns as well as melody and rhythm patterns are affected most. The presence of a confederate speaker does not lead to differences in addressing behavior.

