ANR DEEP-PRIVACY Project

Distributed, Personalized, Privacy-Preserving Learning for Speech Processing

The project focuses on developing distributed, personalized, and privacy-preserving approaches for speech recognition. We propose an approach where each user’s device locally performs private computations and does not share raw voice data, while certain inter-user computations (such as model enrichment) are conducted on a server or a peer-to-peer network, with voice data shared after anonymization.

Objectives: Speech recognition is now used in numerous applications, including virtual assistants that collect, process, and store personal voice data on centralized servers, which raises serious privacy concerns. Embedded speech recognition addresses these concerns, but only during the recognition phase. Speech recognition technology still needs to improve, as its performance remains limited in adverse conditions (e.g., noisy environments, reverberant speech, strong accents). Such improvement requires large speech corpora that are representative of real and diverse usage conditions. Voice data must therefore be shared, but in a way that ensures privacy. Improvements obtained through shared voice data will then benefit all users.

In this context, DEEP-PRIVACY proposes a new paradigm based on a distributed, personalized, and privacy-preserving approach. Some processing occurs on the user’s terminal, which ensures privacy and allows personalized processing to optimize performance. Voice data to be shared on a server or peer-to-peer network must be anonymized before sharing. This defines the two project objectives: the first concerns learning representations of voice signals that preserve privacy, and the second focuses on distributed algorithms and personalization.

List of Partners:

Project Coordinator: INRIA Grand Est

Scientific Manager for LIA: Yannick ESTEVE

Start Date: 01/01/2019

End Date: 30/06/2023
