PhD defense of Thibault Cordier – 13 October 2023

13 October 2023

Date: Friday, 13 October 2023, at 9 a.m.
Place: room “salle des thèses”, Université d’Avignon, Campus Hannah Arendt (centre-ville)
Title: « Hierarchical Imitation and Reinforcement Learning for Multi-Domain Task-Oriented Dialogue Systems »
The defense can be followed live at: https://v-au.univ-avignon.fr/live

Abstract: In this Ph.D. thesis, we study task-oriented dialogue systems, which are designed to assist users in completing specific tasks, such as booking a flight or ordering food. They typically rely on the reinforcement learning paradigm to model the dialogue, which allows the system to reason about the user’s goals and preferences and to select actions that lead to the desired outcome. Our focus is specifically on learning from a limited number of interactions, which is crucial because human interactions are scarce and costly. Standard reinforcement learning algorithms typically require a large amount of interaction data to achieve good performance. To address this challenge, we aim to make dialogue systems more sample-efficient in their training. We draw on two main ideas: imitation and hierarchy. Our first contribution explores the integration of imitation with reinforcement learning. We investigate how to effectively use expert demonstrations to extrapolate knowledge with minimal generalisation effort. Our second contribution focuses on …