Reranked aligners for interactive transcript correction

Benoit Favre; Mickael Rouvier; Frédéric Béchet

Conference paper. Year: 2014

Benoit Favre

Role: Author
PersonId : 4978
IdHAL : benoit-favre
ORCID : 0000-0002-9777-4613
IdRef : 115288511

Laboratoire d'informatique Fondamentale de Marseille - UMR 6166

Traitement Automatique du Langage Ecrit et Parlé

Mickael Rouvier

Role: Author

Laboratoire Informatique d'Avignon

Laboratoire d'informatique Fondamentale de Marseille

Frédéric Béchet

Role: Author
PersonId : 12253
IdHAL : frederic-bechet
IdRef : 070531730

Laboratoire d'informatique Fondamentale de Marseille - UMR 6166

Abstract

Clarification dialogs can help address ASR errors in speech-to-speech translation systems and other interactive applications. We propose to use variants of Levenshtein alignment to merge an errorful utterance with a targeted rephrase of an error segment. ASR errors that might harm the alignment are addressed through phonetic matching, and a word embedding distance accounts for the use of synonyms outside targeted segments. These features yield a 30% relative improvement in word error rate on ASR output compared to not performing the clarification. Twice as many utterances are completely corrected compared to using basic word alignment. Furthermore, we generate a set of potential merges and train a neural network on crowd-sourced rephrases to select the best merge, leading to 24% more instances completely corrected. The system is deployed in the framework of the BOLT project.
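The merging idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain word-level Levenshtein distance with exact word matching (the "basic word alignment" baseline, with no phonetic or word-embedding costs and no reranking), and the function names `word_edit_distance` and `merge_rephrase`, as well as the rule of replacing the closest-matching span, are assumptions made for illustration.

```python
def word_edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1  # exact match only; no phonetic cost
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def merge_rephrase(utterance, rephrase):
    """Replace the utterance span closest to the rephrase.

    Hypothetical merge rule: try every contiguous span of the utterance,
    keep the one with the smallest edit distance to the rephrase
    (ties broken by closest length, then leftmost), and substitute it.
    """
    utt, rep = utterance.split(), rephrase.split()
    if not utt:
        return " ".join(rep)
    best = None
    for start in range(len(utt)):
        for end in range(start + 1, len(utt) + 1):
            dist = word_edit_distance(utt[start:end], rep)
            key = (dist, abs((end - start) - len(rep)), start)
            if best is None or key < best[0]:
                best = (key, start, end)
    _, start, end = best
    return " ".join(utt[:start] + rep + utt[end:])
```

For example, `merge_rephrase("i want to fly to austin tomorrow", "fly to boston")` replaces the span "fly to austin" and returns "i want to fly to boston tomorrow". The exhaustive span search is chosen for clarity rather than speed; it is only practical for short utterances.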

Keywords

Error correction; Dialog systems; ASR error detection; Reranking; Levenshtein alignment

Domains

Computation and Language [cs.CL]

https://amu.hal.science/hal-01194237

Submitted on: Saturday, September 5, 2015, 11:12:38

Last modified on: Friday, March 22, 2024, 18:24:04

Dates and versions

hal-01194237 , version 1 (05-09-2015)

Identifiers

HAL Id : hal-01194237 , version 1

Cite

Benoit Favre, Mickael Rouvier, Frédéric Béchet. Reranked aligners for interactive transcript correction. ICASSP2014 - Speech and Language Processing (ICASSP2014 - SLTC), 2014, Florence, Italy. ⟨hal-01194237⟩

Collections

UNIV-AVIGNON UNIV-TLN LIF CNRS UNIV-AMU EC-MARSEILLE LIA LIS-LAB

133 views

0 downloads
