Paper accepted at EAMT 2024
We are glad to announce that our paper "Bootstrapping Pre-trained Word Embedding Models for Sign Language Gloss Translation" has been accepted at the Conference of the European Association for Machine Translation (EAMT), held in Sheffield this June 24th-27th (https://eamt2024.sheffield.
We modified pre-trained word embeddings for spoken languages so that they represent the meanings of signs in four sign languages more faithfully. We then used these embeddings in machine translation experiments, with overall positive results.
Bootstrapping Pre-trained Word Embedding Models for Sign Language Gloss Translation
Euan McGill, Universitat Pompeu Fabra - Luis Chiruzzo, Universidad de la Republica - Horacio Saggion, Universitat Pompeu Fabra
Abstract
This paper explores a novel method to modify existing pre-trained word embedding models of spoken languages for Sign Language glosses. These newly generated embeddings are described, visualised, and then used in the encoder and/or decoder of models for the Text2Gloss and Gloss2Text tasks of machine translation.
In two translation settings (one including data augmentation-based pre-training and a baseline), we find that bootstrapped word embeddings for glosses improve translation across four Signed/spoken language pairs. Many improvements are statistically significant, including those where the bootstrapped gloss embedding models are used.
Languages included: American Sign Language, Finnish Sign Language, Spanish Sign Language, and Sign Language of the Netherlands.