Wav2Vec2-Large-XLSR-53

The base model, pretrained on 56k hours of multilingual speech audio sampled at 16 kHz (see the datasets listed below). When using the model, make sure that your speech input is also sampled at 16 kHz.

More info: Meta AI Research post "Wav2vec 2.0: Learning the structure of speech from raw audio".
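In practice, the 16 kHz requirement just means resampling whatever audio you have before passing it to the model. Below is a minimal sketch of how that might look with the `transformers` and `torchaudio` libraries; the checkpoint id `facebook/wav2vec2-large-xlsr-53`, the file name `example.wav`, and the mono downmix are illustrative assumptions, not part of the original card.

```python
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

MODEL_ID = "facebook/wav2vec2-large-xlsr-53"  # assumed Hub id for this checkpoint
TARGET_SR = 16_000                            # the model expects 16 kHz input

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID)
model = Wav2Vec2Model.from_pretrained(MODEL_ID)

# Load an arbitrary audio file (hypothetical path) and resample to 16 kHz if needed.
waveform, sample_rate = torchaudio.load("example.wav")
if sample_rate != TARGET_SR:
    waveform = torchaudio.functional.resample(waveform, sample_rate, TARGET_SR)
waveform = waveform.mean(dim=0)  # collapse stereo to mono

# Extract frame-level representations from the pretrained (not fine-tuned) model.
inputs = feature_extractor(waveform.numpy(), sampling_rate=TARGET_SR, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # shape: (1, num_frames, 1024)
print(hidden_states.shape)
```

Note that this is the pretrained model only; for transcription you would fine-tune it (or use an already fine-tuned checkpoint) with a CTC head and tokenizer.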
We also release multilingual pre-trained wav2vec 2.0 (XLSR) models:

| Model   | Architecture | Hours | Languages | Datasets                | Model    |
|---------|--------------|-------|-----------|-------------------------|----------|
| XLSR-53 | Large        | 56k   | 53        | MLS, CommonVoice, BABEL | download |

The XLSR model uses the following datasets for multilingual pretraining:

- MLS: Multilingual LibriSpeech (8 languages, 50.7k hours): Dutch, English, French, German, Italian, Polish, Portuguese, Spanish