The next SLG meeting will be held in room S1 on Thursday, December 21, from 12:00 to 13:00.
We will have the pleasure of welcoming St Germes BENGONO OBIANG, a PhD student working on speech processing, and more specifically on tone recognition in low-resource languages. He is supervised by Norbert TSOPZE and Paulin MELATAGIA of the Université de Yaoundé 1, and by Jean-François BONASTRE and Tania JIMENEZ of the LIA.
Abstract: Many sub-Saharan African languages are tone languages, and most of them are classified as low-resource languages because few resources and tools are available to process them. Identifying the tone associated with a syllable is a key challenge for speech recognition in these languages. We propose models that automate tone recognition in continuous speech and that can easily be incorporated into a speech recognition pipeline for these languages. We investigated different neural architectures as well as several speech feature extraction algorithms (filter banks, LEAF, cepstrogram, MFCC). Given the low-resource setting, we also evaluated Wav2vec models for this task. In this work, we use a public Yoruba speech recognition dataset. Combining cepstrogram (CS) and filter bank (FB) features yields a minimum tone error rate (TER) of 19.54%, while models based on Wav2vec 2.0 reach a TER of 17.72%, showing that Wav2vec outperforms the models used in the literature for tone identification in low-resource languages.
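For readers unfamiliar with the metric, here is a minimal sketch of how a TER figure like those above could be computed. It assumes that TER stands for tone error rate and is defined like word error rate, as the edit distance between the reference and predicted tone sequences divided by the reference length. The abstract does not spell out this definition, so the function `tone_error_rate` and the example sequences are hypothetical illustrations, not the speaker's evaluation code.

```python
def tone_error_rate(reference, hypothesis):
    """Edit distance between two tone sequences, divided by the reference length.

    Assumption: TER is computed analogously to word error rate (WER).
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and hypothesis[:j]; start with the empty reference prefix.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tone in enumerate(reference, start=1):
        curr = [i]  # distance from reference[:i] to an empty hypothesis
        for j, hyp_tone in enumerate(hypothesis, start=1):
            cost = 0 if ref_tone == hyp_tone else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference tone missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra tone in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


# Made-up example: Yoruba tones labelled high (H), mid (M), low (L).
reference = ["H", "M", "L", "H", "M"]
hypothesis = ["H", "L", "L", "H"]
print(f"TER = {tone_error_rate(reference, hypothesis):.2%}")  # prints "TER = 40.00%"
```

In this toy example, one tone is substituted and one is deleted, giving 2 errors over 5 reference tones.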