PhD defense of Titouan Parcollet – 3 December 2019

3 December 2019

The thesis defense of Titouan Parcollet, entitled “Artificial Neural Networks Based on Quaternion Algebra,” will take place on Tuesday, December 3, 2019, at 2:30 PM in the Blaise Pascal amphitheater (CERI). The thesis will be presented before a jury. The defense will be conducted in French. You are also invited to the reception following the defense in Room 5.

Abstract: In recent years, deep learning has become the preferred approach for developing modern artificial intelligence (AI). The significant increase in computing power, along with the ever-growing amount of available data, has made deep neural networks the most efficient solution for solving complex problems. However, accurately representing the multidimensionality of real-world data remains a major challenge for artificial neural architectures. To address this challenge, neural networks based on complex and hypercomplex number algebras have been developed: the multidimensionality of the data is integrated into the neurons, which become complex or hypercomplex components of the model. In particular, quaternion neural networks (QNNs) have been proposed to process three- and four-dimensional data, building on quaternions as representations of rotations in our three-dimensional space. Unfortunately, unlike complex-valued neural networks, which are now accepted as an alternative to real-valued neural networks, QNNs suffer from several limitations, …
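As a rough illustration of the quaternion arithmetic the abstract refers to, the sketch below (not taken from the thesis; all names and values are illustrative) shows the Hamilton product, the operation a quaternion neuron uses in place of ordinary scalar multiplication so that the four components of an input are processed jointly rather than as independent real values.

```python
# Minimal sketch, assuming quaternions stored as (w, x, y, z) arrays.
import numpy as np

def hamilton_product(q1, q2):
    """Multiply two quaternions q = w + x*i + y*j + z*k with the Hamilton product."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,   # real part
        w1*x2 + x1*w2 + y1*z2 - z1*y2,   # i component
        w1*y2 - x1*z2 + y1*w2 + z1*x2,   # j component
        w1*z2 + x1*y2 - y1*x2 + z1*w2,   # k component
    ])

# Toy example: a unit "weight" quaternion rotating an input quaternion,
# the building block that replaces scalar weight * input in a quaternion layer.
x = np.array([0.0, 1.0, 0.0, 0.0])                    # input quaternion
w = np.array([np.cos(0.5), 0.0, 0.0, np.sin(0.5)])    # unit weight quaternion
print(hamilton_product(w, x))
```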

HDR defense of Mohamed Morchid – 26 November 2019

26 November 2019

The HDR defense of Mohamed Morchid, entitled “Neural Networks for Natural Language Processing,” will take place on 26 November at 4 PM in the thesis room (Hannah Arendt campus). It will be presented before a jury composed of:

Reviewers:
Mrs. Dilek Z. HAKKANI-TÜR, Senior Principal Scientist, Alexa AI, USA
Mr. Patrice BELLOT, Professor, AMU Polytech’, LIS, Marseille
Mr. Frédéric ALEXANDRE, Research Director, INRIA, Bordeaux

Examiners:
Mr. Yannick ESTÈVE, Professor, AU, LIA, Avignon
Mr. Frédéric BÉCHET, Professor, AMU, LIS, Marseille

PhD defense of Adrien Gresse – 6 February 2020

6 February 2020

The defense will take place on Thursday, 6 February 2020, at 14:30, at CERI (Amphitheater Ada).

Title: “The Art of Voice: Characterizing Vocal Information in Artistic Choices”

Jury members:
Mr. Emmanuel Vincent, Research Director at Inria-Nancy, LORIA, Reviewer
Mr. Jean-Julien Aucouturier, Research Scientist at CNRS, IRCAM, Reviewer
Ms. Julie Mauclair, Assistant Professor at the University of Toulouse, IRIT, Examiner
Ms. Lori Lamel, Research Director at CNRS, LIMSI, Examiner
Mr. Yannick Estève, Professor at the University of Avignon, LIA, Examiner
Mr. Jean-François Bonastre, Professor at the University of Avignon, LIA, Thesis Supervisor
Mr. Richard Dufour, Assistant Professor at the University of Avignon, LIA, Co-supervisor
Mr. Vincent Labatut, Assistant Professor at the University of Avignon, LIA, Co-supervisor

Abstract: To reach an international audience, audiovisual productions (films, series, video games) need to be translated into other languages. The original-language voices in the work are often replaced by new voices in the target language. The vocal casting process, which aims to choose a voice (an actor) consistent with the original voice and the character played, is performed manually by an artistic director (AD). Today, ADs tend to favor new “talents” (less expensive and more available than experienced dubbers), but they cannot conduct large-scale auditions. Providing audiovisual industry professionals with …

ANR Project VoicePersonae

1 February 2019

With recent advancements in automatic speech and language processing, humans are increasingly interacting vocally with intelligent artificial agents. The use of voice in applications is expanding rapidly, and this mode of interaction is becoming more widely accepted. Nowadays, vocal systems can offer synthesized messages of such quality that discerning them from human-recorded messages is difficult. They are also capable of understanding requests expressed in natural language, albeit within their specific application framework. Furthermore, these systems frequently recognize or identify their users by their voices.

ANR RUGBI Project

1 February 2019

Searching for Linguistic Units to Improve the Measurement of Speech Intelligibility Altered by Pathological Production Disorders

In the context of speech production disorders observed in ENT cancers and in neurological, sensory, or structural pathologies, the goal of the RUGBI project is to improve the measurement of intelligibility deficits.

List of partners:
IRIT, Institut de Recherche en Informatique de Toulouse
CHU Toulouse, Direction de la Recherche
LPL, Laboratoire Parole et Langage
LIA, Laboratoire d’Informatique d’Avignon
OCTOGONE, Unité de Recherche Interdisciplinaire Octogone

Project coordinator: Jérôme Farinas (IRIT)
Scientific lead for LIA: Corinne Fredouille
Period: 2019-2022

ANR ROBOVOX Project

1 February 2019

Robust Vocal Identification for Mobile Security Robots

This project focuses on robust speaker identification for mobile security robots and proposes solutions that integrate modalities complementary to voice recognition, leveraging the context of human-robot interaction.

List of partners:
INRIA Grand Est
AI Mergence
LIA, Laboratoire d’Informatique d’Avignon

Project coordinator: LIA
Scientific manager for LIA: Driss Matrouf
Start date: 01/02/2019
End date: 30/04/2024

ANR DEEP-PRIVACY Project

1 January 2019

Distributed, Personalized, Privacy-Preserving Learning for Speech Processing

The project focuses on developing distributed, personalized, and privacy-preserving approaches for speech recognition. We propose an approach in which each user’s device performs private computations locally and never shares raw voice data, while certain inter-user computations (such as model enrichment) are carried out on a server or a peer-to-peer network, with voice data shared only after anonymization.

Objectives: Speech recognition is now used in numerous applications, including virtual assistants that collect, process, and store personal voice data on centralized servers, raising serious privacy concerns. Embedded speech recognition addresses these privacy aspects, but only during the speech recognition phase. However, speech recognition technology still needs to improve, as its performance remains limited in adverse conditions (e.g., noisy environments, reverberant speech, strong accents). Such improvements can only be achieved from large speech corpora representative of real and diverse usage conditions; hence the need to share voice data while ensuring privacy. Improvements obtained through shared voice data will then benefit all users.

In this context, DEEP-PRIVACY proposes a new paradigm based on a distributed, personalized, and privacy-preserving approach. Some processing occurs on the user’s terminal, ensuring privacy …
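As a loose illustration of the distributed scheme described above, here is a minimal, hypothetical sketch (not the project’s actual protocol or code; the function names and the toy linear model are assumptions) in which raw audio features stay on each device and only locally computed model parameters are sent to a server for aggregation.

```python
# Minimal sketch, assuming a toy linear model and a simple parameter-averaging server.
import numpy as np

def local_update(weights, features, labels, lr=0.01):
    """One private gradient step computed on the user's device."""
    preds = features @ weights
    grad = features.T @ (preds - labels) / len(labels)
    return weights - lr * grad          # only these parameters leave the device

def server_aggregate(device_weights):
    """Model enrichment on the server: average the parameters shared by devices."""
    return np.mean(device_weights, axis=0)

# Toy round with three devices; features stand in for locally stored acoustic vectors.
rng = np.random.default_rng(0)
global_w = np.zeros(5)
device_updates = []
for _ in range(3):
    X = rng.normal(size=(20, 5))        # raw data: never shared
    y = X @ np.array([1.0, -0.5, 0.0, 2.0, 0.3]) + rng.normal(scale=0.1, size=20)
    device_updates.append(local_update(global_w, X, y))
global_w = server_aggregate(device_updates)
print(global_w)
```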
