ANR EVA Project

1 January 2023

Explicit Voice Attributes

Describing a voice in a few words remains a highly arbitrary task. We can speak with a “deep”, “breathy”, “bright” or “hoarse” voice, but fully characterizing a voice would require a closed set of rigorously defined attributes constituting an ontology. No such description grid exists. Machine learning applied to speech suffers from the same weakness: in most automatic processing tasks, when a speaker is modeled, abstract global representations are used without making their characteristics explicit. For instance, automatic speaker verification and identification are usually tackled with the x-vector paradigm, which describes a speaker’s voice by an embedding vector designed only to distinguish speakers. Despite their very good accuracy for speaker identification, x-vectors are usually unsuitable for detecting similarities between different voices that share common characteristics. The same observations can be made for speech generation. We propose to carry out a comprehensive set of analyses to extract salient, unaddressed voice attributes that enrich structured representations usable for synthesis and voice conversion.

Partner list:
Project leader: Orange
Scientific leader for LIA: Yannick Estève
Start date: 01/01/2023
End date: 31/12/2025

ANR Project VoicePersonae

1 February 2019

With recent advancements in automatic speech and language processing, humans are increasingly interacting vocally with intelligent artificial agents. The use of voice in applications is expanding rapidly, and this mode of interaction is becoming more widely accepted. Nowadays, vocal systems can offer synthesized messages of such quality that discerning them from human-recorded messages is difficult. They are also capable of understanding requests expressed in natural language, albeit within their specific application framework. Furthermore, these systems frequently recognize or identify their users by their voices.