Neural Representations of Dialogical History for Improving Upcoming Turn Acoustic Parameters Prediction - Aix-Marseille Université
Conference paper. Year: 2020

Neural Representations of Dialogical History for Improving Upcoming Turn Acoustic Parameters Prediction

Abstract

Predicting the acoustic and linguistic parameters of an upcoming conversational turn is important for dialogue systems aiming to include low-level adaptation to the user. It is known that speakers can influence each other's speech production during an interaction. However, the precise dynamics of this phenomenon are not well established, especially in natural conversations. We developed a model based on an RNN architecture that predicts speech variables (energy, F0 range, and speech rate) of the upcoming turn from a representation vector describing the speech information of previous turns. We compare prediction performance when using a dialogical history (from both participants) versus a monological history (from the upcoming turn's speaker only). We found that the information contained in previous turns produced by both the speaker and the interlocutor reduces the error in predicting the current acoustic target variable. In addition, the prediction error decreases as the number of previous turns taken into account increases.
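The contrast between dialogical and monological history can be illustrated with a minimal sketch. It is not the authors' implementation: the `Turn` structure, the `build_history` and `encode_turn` helpers, the speaker flag appended to each turn vector, and the untrained `TinyRNN` with random weights are all assumptions made for illustration. The abstract does not specify how turns are encoded or how the RNN is configured.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str           # hypothetical speaker label, e.g. "A" or "B"
    features: List[float]  # [energy, f0_range, speech_rate], assumed pre-normalized


def build_history(turns: List[Turn], target_index: int, k: int,
                  dialogical: bool) -> List[Turn]:
    """Return up to k turns that precede turns[target_index].

    dialogical=True keeps turns from both participants;
    dialogical=False keeps only turns by the target turn's speaker.
    """
    target_speaker = turns[target_index].speaker
    previous = turns[:target_index]
    if not dialogical:
        previous = [t for t in previous if t.speaker == target_speaker]
    return previous[-k:] if k > 0 else []


def encode_turn(turn: Turn, target_speaker: str) -> List[float]:
    # Assumption: append a flag marking whether the target speaker produced the turn.
    return turn.features + [1.0 if turn.speaker == target_speaker else 0.0]


class TinyRNN:
    """Untrained Elman-style RNN; weights are random, for shape illustration only."""

    def __init__(self, input_size: int, hidden_size: int, output_size: int,
                 seed: int = 0) -> None:
        rng = random.Random(seed)

        def matrix(rows: int, cols: int) -> List[List[float]]:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden_size = hidden_size
        self.w_in = matrix(hidden_size, input_size)
        self.w_rec = matrix(hidden_size, hidden_size)
        self.w_out = matrix(output_size, hidden_size)

    def predict(self, inputs: List[List[float]]) -> List[float]:
        # Read the history one turn at a time, updating the hidden state.
        h = [0.0] * self.hidden_size
        for x in inputs:
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * hi for w, hi in zip(self.w_rec[j], h))
                )
                for j in range(self.hidden_size)
            ]
        # Linear read-out: one value per predicted speech variable.
        return [sum(w * hi for w, hi in zip(row, h)) for row in self.w_out]


# Toy conversation with made-up values; the last turn is the prediction target.
turns = [
    Turn("A", [0.2, -0.1, 0.5]),
    Turn("B", [0.0, 0.3, -0.2]),
    Turn("A", [0.1, 0.0, 0.4]),
    Turn("B", [-0.3, 0.2, 0.1]),
    Turn("A", [0.4, -0.2, 0.3]),
]
target = 4
dialogical = build_history(turns, target, k=3, dialogical=True)
monological = build_history(turns, target, k=3, dialogical=False)

model = TinyRNN(input_size=4, hidden_size=8, output_size=3)
prediction = model.predict([encode_turn(t, turns[target].speaker) for t in dialogical])
```

In this toy conversation the dialogical history holds the three most recent turns from both speakers, while the monological history holds only the two earlier turns by speaker A. Varying `k` corresponds to changing the number of previous turns taken into account.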
Main file
2785.pdf (379.66 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03224194 , version 1 (12-05-2021)

Cite

Simone Fuscone, Benoit Favre, Laurent Prevot. Neural Representations of Dialogical History for Improving Upcoming Turn Acoustic Parameters Prediction. Interspeech 2020, Oct 2020, Virtual (Shanghai), China. pp.4203-4207, ⟨10.21437/interspeech.2020-2785⟩. ⟨hal-03224194⟩
95 views
157 downloads
