Stream Learning for Multilingual Knowledge Transfer
The internet contains vast amounts of data and information in various languages, both written and audiovisual. There’s an increasing need to leverage this largely untapped resource. The SELMA project, funded by the EU, focuses on ingesting and monitoring large quantities of data. It systematically trains machine learning models to perform tasks in natural language and utilizes these models to monitor data streams, aiming to enhance multilingual media monitoring and real-time content production. Ultimately, the project will advance cutting-edge techniques in language modeling, automatic translation, speech recognition, and synthesis.