PhD thesis defense of Arthur Amalvy – 12/09/2024
Thesis title: Natural Language Processing for the Representation of Narrative Texts through Character Networks
Date: 12/09/2024 – 9 AM
Place: CERI’s Ada Lovelace amphitheater.

Abstract: A character network represents characters as vertices in a graph, and their relationships as edges between them. For literary works, such a network models a whole narrative as a single mathematical object. Depending on the needs, its edges can represent different types of interactions between characters: co-occurrence, conversation, direct action… Additionally, the temporal changes in the relationships between characters can be modeled with dynamic networks. Thanks to this flexibility, character networks have been used to tackle a number of tasks, such as literary genre classification, story segmentation, recommendation or summarization.

Manually extracting these networks is costly, which is why many researchers are interested in automating the process. This, in turn, requires solving different Natural Language Processing (NLP) tasks such as Named Entity Recognition (NER), coreference resolution or speaker attribution. In this thesis, we present contributions to this automatic extraction process in the case of novels, as well as to character network applications. Inspired by the 2019 survey of Labatut and Bost that summarizes existing extraction efforts in a generic extraction framework, we propose Renard, a…