SLG – Laboratoire Informatique d’Avignon

LIA Doctoral Fellowship 2025

31 January 2025

The LIA’s 2025 doctoral grant has been awarded to the SLG team. Several topics have been proposed and are available on the adum.fr platform https://adum.fr/as/ed/proposition.pl?site=avignon : Interested candidates should apply on the adum.fr platform. Before doing so, however, it is strongly recommended that you contact the researchers proposing the topics in order to discuss with them. The auditions will be organised by Doctoral School 536 according to the procedures indicated on https://univ-avignon.fr/recherche/le-doctorat/je-souhaite-preparer-un-doctorat/. They will take place at the beginning of June, the final date will be communicated to the selected candidates by the supervisors.

Thesis Defense of Thibault Bañeras-Roux – 17/01/2025

15 January 2025

Title : Analysis and understanding of the evaluation of automatic speechrecognition systems: towards metrics integrating human perception. Date: Friday January 17 at 2:00 pm Place : Amphithéâtre du bâtiment 34, LS2N, Campus Lombarderie, 2 chemin de laHoussinière 44000 Nantes. The defense will be presented in French. Abstract : Today, word error rate remains the most widely used metric forevaluating automatic speech recognition (ASR) systems. However, thismetric has limitations in terms of correlation with human perception andfocuses only on spelling preservation. In this thesis, we proposealternative metrics that can evaluate spelling, but also grammar,semantics or phonetics. To analyze the ability of these metrics to reflect transcript qualityfrom the user’s point of view, we built up a dataset named HATS,annotated by 143 French-speaking subjects. Each annotator examined 50triplets, made up of a manual reference transcription and two hypothesesfrom different ASR systems, to determine which hypothesis was, in theiropinion, the most faithful. By calculating the number of times a metric agrees with the annotators’choices, we obtain a measure of its correlation with human perception.This corpus can thus be used to rank different metrics according to thejudgment of a human reader. Our results show that SemDist, a metricbased on BERT’s semantic representations for comparing Plus d'infos

Contrat doctoral Agorantic 2025

1 January 2025

Une allocation doctorale 2025 pour le LIA a été attribuée à l’équipe SLG par la FR Agorantic . Le sujet proposé est disponible sur la plateforme adum.fr https://adum.fr/as/ed/proposition.pl?site=avignon : Les candidats et candidates intéressées doivent postuler sur la plateforme adum.fr. Mais, avant cela, il est fortement recommander de contacter les chercheurs et chercheuses qui proposent les sujets, afin d’en discuter avec eux et elles. Les auditions seront organisées par la FR Agorantic selon les modalités générales indiquées sur https://agorantic.univ-avignon.fr/en/thesis/. Elles auront lieu début juin, la date définitive sera communiqué aux candidats retenus par les encadrants.

SLG seminar – Ana Montalvo – 11/06/2024

4 November 2024

Title: Exploring Short-Duration Spoken Language Recognition: Insights from CENATAV Date: 11/06/2024 – 11AM Room: S4 Abstract : This presentation will introduce the Advanced Technologies Application Center (CENATAV), outlining its core mission and research areas, with a focus on the work of its Voice Processing Group. We will discuss the challenges of conducting research with limited access to high-performance computing resources and large datasets, emphasizing our recent work on spoken language recognition in very short-duration audio signals. Language: English

PhD Defense of Timothée Dhaussy – 10/21/2024

18 October 2024

Date: 21th of october 2024 at 2PM Place: Thesis room, at the Hannah Arendt campus of Avignon Université. The videoconference link is the following: https://bbb.univ-avignon.fr/rooms/vtj-xje-xex-gyw/join . The jury will be composed of: Dr Aurélie Clodic, LAAS-CNRS, RapporteurePr Julien Pinquier, Université de Toulouse, IRIT, RapporteurPr Laurence Devillers, Sorbonne Université, LISN-CNRS, ExaminatricePr Olivier Alata, Université Jean Monnet, Laboratoire Hubert Curien, ExaminateurPr Fabrice Lefèvre, Avignon Université, LIA, Directeur de thèseDr Bassam Jabaian, Avignon Université, LIA, Co-encadrant Title: Proactive multimodal human-robot interaction in a hospital In this thesis, we focus on creating a proactive multimodal system for the social robot Pepper, designed for a hospital waiting room. To achieve this, we developed a cognitive human-robot interaction architecture, based on a continuous loop of perceptions, representation, and decision-making. The flow of perceptions is divided into two steps: first, retrieving data from the robot’s sensors, and then enriching it through refining modules. A speaker diarization refining module, based on a Bayesian model of fusion of audio and visual perceptions through spatial coincidence, was integrated. To enable proactive action, we designed a model analyzing the users’ availability for interaction in a waiting room. The refined perceptions are then organized and aligned to create a constantly updated representation of Plus d'infos

PhD defense of Lucas Druart – 24/10/2024

16 October 2024

Date: Jeudi 24 octobre à 15h Lieu: salle des thèses sur le campus Hannah Arendt. Vous pouvez également y assister à distance si vous le souhaitez grâce au lien suivant : https://v-au.univ-avignon.fr/live/bbb-soutenance-these-l-druart-24-octobre-2024/. Title : Towards Contextual and Structured Spoken Task-Oriented Dialogue Understanding Abstract : Accurately understanding users’ requests is key to provide smooth interactions with spoken Task-Oriented Dialogue (TOD) systems. Traditionally such systems adopt cascade approaches which combine an Automatic Speech Recognition (ASR) component with a Natural Language Understanding (NLU) one. Yet, those systems still have trouble to accurately map complex user’s request with their internal representation. Recent work highlights potential directions to improve those systems. On the one hand, end-to-end approaches have successfully enhanced Spoken Language Understanding (SLU) system’s performance. Indeed, they provide more robust and accurate predictions by leveraging joint optimization and paralinguistic information. On the other hand, textual datasets propose fine-grained semantic representations. Such representations seem more adequate to represent user’s complex requests. This thesis explores both directions towards contextual and structured spoken task-oriented dialogue understanding. We first conduct a preliminary study dedicated to getting the grips of SLU in the context of TOD. We designed a cascade approach to perform spoken Dialogue State Tracking (DST) on MultiWOZ. Our approach ranked first in Plus d'infos

PhD defense of Gaelle Laperrière – 09/09/2024

3 September 2024

Date: 9th of Septembre 2024 Time : 3PM Place: Ada Lovelace CERI’s amphitheater, at the Jean-Henri Fabre campus of Avignon Université. The jury will be composed of: Alexandre Allauzen, PR at Université Paris Dauphine-PSL, LAMSADE – Rapporteur Benoit Favre, PR at Aix-Marseille Université, LIS – Rapporteur Marco Dinarelli, CR at CNRS, LIG – Examiner Nathalie Camelin, MCF at Le Mans Université, LIUM – Examiner Philippe Langlais, PR at Université de Montréal, DIRO, RALI – Examiner Fabrice Lefèvre, PR at Avignon Université, LIA – Examiner Yannick Estève, PR at Avignon Université, LIA – Thesis director Sahar Ghannay, MCF at Université Paris-Saclay, LISN, CNRS – Thesis co-supervisor Bassam Jabaian, MCF at Avignon Université, LIA – Thesis co-supervisor Title: Spoken Language Understanding in a multilingual context This thesis falls within the scope of Deep Learning applied to Spoken Language Understanding. Its primary objective is to leverage existing data of large resourced annotated languages for speech semantics to develop effective understanding systems in low resourced languages. In recent years, significant advances were made in the field of automatic speech translation through new approaches that converge audio and textual modalities, the latter benefiting from vast amounts of data. By visualizing spoken language understanding as a translation task from a natural Plus d'infos

ANR PANTAGRUEL Project

21 August 2024

Modèles de langue multimodaux et inclusifs pour le français général et clinique Le projet Pantagruel (ANR 23-IAS1-0001) ambitionne de développer et évaluer des modèles linguistiques multimodaux (écrit, oral, pictogrammes) inclusifs pour le français. Il mobilise des chercheurs de diverses disciplines telles que l’informatique, le traitement du signal, la sociologie et la linguistique pour assurer des résultats fiables et variés. Liste des partenaires : Responsable Scientifique pour le LIA : Yannick Estève (Equipe SLG) Date Début : 2023-11-20 Date Fin : 2026-11-20 En Savoir Plus

ANR MALADES Project

4 June 2024

Grands modèles de langue adaptables et souverains pour le domaine médical français The recent arrival of Large Language Models (LLMs) and their associated tools for the general public reveals major challenges for society. Among the many fields that are, or will be, impacted by these generative models, the biomedical field is one of those that currently attract the attention of industrialists, researchers, but also the general public. Indeed, the need for tools and potential applications seems immense, whether, for example, at the level of the processing of textual documents, medical imaging, or even voice interaction. Due to the sensitive nature of the personal data handled and the fears of society associated with decision support tools, work in natural language processing (NLP) must innovate by addressing the issues inherent in this field. As part of the MALADES project, we presented innovative approaches for the integration of LLM in health centers. The aim is to equip these centers with NLP tools derived from LLMs and adapted for the biomedical field while maintaining sovereignty of the models and complete control of their health data. The work we carry out focuses on four areas of research: 1) the study of the legal and ethical Plus d'infos

SLG Seminar – Tanja Schultz – 25/04/2024

22 April 2024

On Thursday 25 April at 11am, we will host a talk from Prof. Tanja Schultz on « Neural Signal Interpretation for Spoken Communication ». The room will be defined later. Please find below a short abstract and bio from Prof. Tanja Schultz. Abstract: This talk presents advancements in decoding neural signals, providing further insights into the intricacies of spoken communication. Delving into both speech production and speech perception, we discuss low latency processing of neural signals from surface EEG, stereotactic EEG, and intracranial EEG using machine learning methods. Practical implications and human-centered applications are considered, including silent speech interfaces, neuro-speech prostheses, and the detection of auditory attention and distraction in communication. This presentation aims to spark curiosity about the evolving landscape of neural signal interpretation and its impact on the future of spoken communication. Bio: Tanja Schultz received the diploma and doctoral degrees in Informatics from University of Karlsruhe and a Master degree in Mathematics and Sport Sciences from Heidelberg University, both in Germany. Since 2015 she is Professor for Cognitive Systems of the Faculty of Mathematics & Computer Science at the University of Bremen, Germany. Prior to Bremen she spent 7 years as Professor for Cognitive Systems at KIT (2007-2015) and over 20 years as Plus d'infos

1 2 3 4 »