ANR AISSPER Project

AISSPER: Artificial Intelligence for Semantically controlled SPEech undeRstanding

Artificial Intelligence (AI) is of strategic importance at the national level because of the impressive results that deep learning algorithms have achieved across a wide range of applications, in domains such as natural language processing (NLP), medicine, and political analytics. France has emerged as a leader in deep learning, supported by political efforts in recent years. Over the last decade, substantial work has been devoted to end-to-end Spoken Language Understanding (SLU) systems, driven by the feasibility of applications such as personal assistants and conversational systems. In automatic speech recognition (ASR), architectures based on quaternions, a hypercomplex number algebra, have achieved better results while requiring less processing time (Morchid 2018) and fewer parameters to estimate than conventional models (Parcollet et al. 2018, 2019). Reducing the number of model parameters makes it possible to train neural architectures efficiently with limited amounts of data, which matters because data for specific semantic concepts and contexts in specific domains is often hard to obtain. However, intrinsically linked learning processes such as ASR and SLU hinder parallelization over training examples, which is critical for long sequences because memory constraints limit how many examples fit in a batch. Furthermore, error analyses conducted in completed projects such as M2CR, JOKER, VERA, SUMACC, Media, and DECODA highlighted the importance of leveraging prior domain knowledge for semantic interpretation. AISSPER will explore novel attention models that use semantics to focus on specific contextual information, thereby improving concept classification.
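The parameter saving attributed to quaternion architectures can be illustrated with a minimal sketch. It is not taken from the cited works: the function names below are hypothetical, biases are ignored, and the count assumes the usual construction in which each quaternion weight (4 real numbers) links one quaternion input to one quaternion output through the Hamilton product.

```python
def hamilton_product(q, p):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    r1, x1, y1, z1 = q
    r2, x2, y2, z2 = p
    return (
        r1 * r2 - x1 * x2 - y1 * y2 - z1 * z2,  # real part
        r1 * x2 + x1 * r2 + y1 * z2 - z1 * y2,  # i component
        r1 * y2 - x1 * z2 + y1 * r2 + z1 * x2,  # j component
        r1 * z2 + x1 * y2 - y1 * x2 + z1 * r2,  # k component
    )


def real_layer_params(n_in, n_out):
    """Weights in a real-valued dense layer (bias ignored)."""
    return n_in * n_out


def quaternion_layer_params(n_in, n_out):
    """Weights in a quaternion dense layer over the same real dimensions.

    Assumption: the n_in and n_out real features are grouped into
    quaternions of 4 components, and each input/output pair shares
    one quaternion weight, i.e. 4 real numbers (bias ignored).
    """
    if n_in % 4 or n_out % 4:
        raise ValueError("dimensions must be multiples of 4")
    return 4 * (n_in // 4) * (n_out // 4)


if __name__ == "__main__":
    print(hamilton_product((0, 1, 0, 0), (0, 0, 1, 0)))  # i * j
    print(real_layer_params(256, 256), quaternion_layer_params(256, 256))
```

Under these assumptions, a layer over the same real input and output dimensions needs a quarter of the weights, because the four components of each quaternion weight are reused across the four output components.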

AISSPER Context: Effectively modeling the variability of human language at the phoneme, word, and sentence levels remains a largely open research problem in NLP. Extracting relevant keywords, topics, or concepts from a full sentence or spoken document remains challenging, even for state-of-the-art end-to-end systems. For speech signals in particular, recording conditions and the lack of domain-specific data make it difficult to extract relevant information across different contexts without external knowledge such as domain-specific ontologies. Another crucial issue for available solutions is the interpretability and robustness of systems, a concern for SLU systems that was discussed at the IRASL workshop during NIPS 2018. AISSPER therefore proposes to highlight the relevant contexts and their associated uncertainty in order to improve interpretability and robustness.

AISSPER aims to develop new semantic models at the sentence and conversation levels for extracting relevant information from spoken documents. Specifically, AISSPER will develop new neural attention mechanisms to improve end-to-end neural SLU systems at both the sentence and document levels. To achieve this, AISSPER will foster strong collaboration among established researchers in automatic speech processing and machine learning from LIUM and LIA (academia) and Orkis (industry). AISSPER will also build on two existing collaborations: between LIUM and MILA on automatic translation in the European/Canadian M2CR project, and between LIA and MILA on the development of quaternion neural networks.

List of Partners:

Project Coordinator: LIA

Scientific Manager for LIA: Mohamed MORCHID

Start Date: 01/01/2020

End Date: 30/09/2024
