Thesis Defense of Thibault Bañeras-Roux – 17/01/2025
Title: Analysis and understanding of the evaluation of automatic speech recognition systems: towards metrics integrating human perception.

Date: Friday, January 17 at 2:00 pm

Place: Amphithéâtre du bâtiment 34, LS2N, Campus Lombarderie, 2 chemin de la Houssinière, 44000 Nantes.

The defense will be presented in French.

Abstract: Today, word error rate remains the most widely used metric for evaluating automatic speech recognition (ASR) systems. However, this metric has limitations in terms of correlation with human perception and focuses only on spelling preservation. In this thesis, we propose alternative metrics that can evaluate spelling, but also grammar, semantics or phonetics.

To analyze the ability of these metrics to reflect transcript quality from the user's point of view, we built a dataset named HATS, annotated by 143 French-speaking subjects. Each annotator examined 50 triplets, each made up of a manual reference transcription and two hypotheses from different ASR systems, to determine which hypothesis was, in their opinion, the more faithful.

By calculating the number of times a metric agrees with the annotators' choices, we obtain a measure of its correlation with human perception. This corpus can thus be used to rank different metrics according to the judgment of a human reader. Our results show that SemDist, a metric based on BERT's semantic representations for comparing ...
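
The agreement count described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual protocol: the triplet layout, the `agreement_rate` helper, the rule that a tie counts as a disagreement, and the toy English sentences are all assumptions made for illustration, and word error rate is used only as an example metric (lower score means better hypothesis).

```python
from typing import Callable, List, Tuple

# Hypothetical layout: (reference, hypothesis_a, hypothesis_b, human_choice),
# where human_choice is "a" or "b".
Triplet = Tuple[str, str, str, str]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)


def agreement_rate(metric: Callable[[str, str], float],
                   triplets: List[Triplet]) -> float:
    """Fraction of triplets where the metric prefers the hypothesis the human chose."""
    agree = 0
    for reference, hyp_a, hyp_b, human_choice in triplets:
        score_a = metric(reference, hyp_a)
        score_b = metric(reference, hyp_b)
        if score_a == score_b:
            continue  # assumption: a tie counts as a disagreement
        metric_choice = "a" if score_a < score_b else "b"
        agree += metric_choice == human_choice
    return agree / len(triplets)


# Invented toy data, not taken from HATS.
toy_triplets: List[Triplet] = [
    ("the cat sat on the mat", "the cat sat on the mat", "the cat sat on a mat", "a"),
    ("she reads a book", "she read a book", "he reads the book", "a"),
    ("turn left here", "turn left hear", "turn lift here", "a"),
    ("i like green tea", "i like green", "i likes green tee", "b"),
]

print(agreement_rate(word_error_rate, toy_triplets))  # 0.5 on these toy examples
```

Any other metric with the same `(reference, hypothesis) -> score` shape could be passed in place of `word_error_rate`, which is what allows several metrics to be ranked on the same set of human choices.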