This model is in maintenance mode only, so we won't accept any new PRs changing its code. If you run into any issues running this model, please reinstall the last version that supported this model: v4.30.0. You can do so by running the following command: `pip install -U transformers==4.30.0`.

The M-CTC-T model was proposed in Pseudo-Labeling For Massively Multilingual Speech Recognition by Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert. The model is a 1B-param transformer encoder, with a CTC head over 8065 character labels and a language identification head over 60 language ID labels. It is trained on Common Voice (version 6.1, December 2020 release) and VoxPopuli. After training on Common Voice and VoxPopuli, the model is trained on Common Voice only. The labels are unnormalized character-level transcripts (punctuation and capitalization are not removed). The model takes as input Mel filterbank features from a 16kHz audio signal.

The abstract from the paper is the following:

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems. In this work, we extend pseudo-labeling to massively multilingual speech recognition with 60 languages. We propose a simple pseudo-labeling recipe that works well even with low-resource languages: train a supervised multilingual model, fine-tune it with semi-supervised learning on a target language, generate pseudo-labels for that language, and train a final model using pseudo-labels for all languages, either from scratch or by fine-tuning.
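Because the model uses a CTC head, its frame-level predictions are typically collapsed into a label sequence with greedy CTC decoding: merge consecutive repeats, then drop blanks. The sketch below illustrates only that collapse rule; the blank index of 0 and the tiny label ids are assumptions for the example, not the model's actual 8065-label vocabulary.

```python
# Minimal sketch of greedy CTC decoding (not the Transformers API).
# Assumption: blank token has index 0; label ids are illustrative.

def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse repeated frame-level ids, then remove blank ids."""
    decoded = []
    prev = None
    for i in frame_ids:
        # Keep an id only when it differs from the previous frame
        # (CTC repeat-merge) and is not the blank symbol.
        if i != prev and i != blank:
            decoded.append(i)
        prev = i
    return decoded

# A blank between two identical ids separates them, so both survive:
print(ctc_greedy_decode([0, 3, 3, 0, 3, 5, 5, 0]))  # -> [3, 3, 5]
```

Note the role of the blank symbol: without the blank at position 3, the two runs of id 3 would be merged into one, which is exactly why CTC models emit blanks between repeated characters.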