Wav2Vec2-Large-XLSR-53: The base model, pretrained and fine-tuned on 960 hours of Librispeech on 16 kHz sampled speech audio. When using the model, make sure that your speech input is also sampled at 16 kHz. More info: Meta AI Research post: Wav2vec 2.0: Learning the structure of speech from...
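The 16 kHz requirement can be checked before audio is passed to the model. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_sample_rate` and the constant `EXPECTED_RATE` are hypothetical names chosen for illustration, and the sketch assumes the input is an uncompressed WAV file.

```python
import wave

# Sampling rate the model expects, as stated in the text above.
EXPECTED_RATE = 16_000


def check_sample_rate(path: str, expected: int = EXPECTED_RATE) -> int:
    """Return the sampling rate of a WAV file, raising if it is not `expected`.

    Hypothetical helper for illustration; it only reads the WAV header.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
    if rate != expected:
        raise ValueError(
            f"{path} is sampled at {rate} Hz; resample it to {expected} Hz first"
        )
    return rate
```

A file that fails this check would need to be resampled with an audio library of your choice before inference or fine-tuning.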
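Note that this model should be fine-tuned on a downstream task, such as automatic speech recognition. jonatasgrosman/wav2vec2-large-xlsr-53-english is a popular English fine-tuned CTC model based on Wav2Vec2 XLSR, and models for other languages have also been trained. If you want to use the model's output anyway (to obtain only feature vectors from audio), use a Wav2Vec2FeatureExtractor instead of a Wav2Vec2Processor, and Wav...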
Wav2Vec2-Large-XLSR-Persian-ASR: visit https://huggingface.co/lnxdx/Wav2Vec2-Large-XLSR-Persian-ShEMO...
CommonVoice (36 languages, 3.6k hours): Arabic, Basque, Breton, Chinese (CN), Chinese (HK), Chinese (TW), Chuvash, Dhivehi, Dutch, English, Esperanto, Estonian, French, German, Hakh-Chin, Indonesian, Interlingua, Irish, Italian, Japanese, Kabyle, Kinyarwanda, Kyrgyz, Latvian, Mongolian,...
Too Long; Didn't Read: This guide explains the steps to fine-tune Meta AI's wav2vec2 XLS-R model for automatic speech recognition ("ASR"). The guide includes step-by-step instructions on how to build a Kaggle Notebook that can be used to fine-tune the model. The model is trained o...
In contrast to most NLP models, XLSR-Wav2Vec2 has a much larger input length than output length. E.g., a sample of input length 50000 has an output length of only about 150 frames, since the convolutional feature encoder downsamples the waveform by a factor of roughly 320. Given the large input sizes, it is much more efficient to pad the training batches dynamicall...
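The input-to-output length ratio and the idea of per-batch padding can be sketched in a few lines of plain Python. This sketch assumes the standard wav2vec 2.0 feature-encoder layout (seven convolutional layers with kernel sizes 10, 3, 3, 3, 3, 2, 2 and strides 5, 2, 2, 2, 2, 2, 2), which the text above does not state. The names `CONV_LAYERS`, `output_length`, and `pad_batch` are hypothetical, and `pad_batch` only illustrates the padding step, not the collator used in any particular training script.

```python
from typing import List, Sequence, Tuple

# ASSUMPTION: (kernel, stride) pairs of the standard wav2vec 2.0 feature encoder.
CONV_LAYERS = ((10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2))


def output_length(num_samples: int, layers=CONV_LAYERS) -> int:
    """Number of output frames for a raw waveform of `num_samples` samples."""
    length = num_samples
    for kernel, stride in layers:
        # Unpadded 1-D convolution: floor((L - kernel) / stride) + 1
        length = (length - kernel) // stride + 1
    return length


def pad_batch(
    batch: Sequence[Sequence[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[List[int]]]:
    """Pad every sample to the longest sample in THIS batch (dynamic padding).

    Returns the padded samples and a mask with 1 for real values, 0 for padding.
    """
    max_len = max(len(sample) for sample in batch)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in batch]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in batch]
    return padded, mask


print(output_length(50_000))  # 156 under the assumed layer configuration
```

Under those assumed settings, a 50000-sample input maps to 156 frames, a reduction of about 320x. Padding to the longest sample in each batch, rather than to the longest sample in the whole dataset, avoids spending compute on padding that most batches do not need.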
Wav2Vec 2.0 Large (LV-60 + CV + SWBD + FSH) ** | Finetuning split: 300 hours Switchboard | Dataset: Libri-Light + CommonVoice + Switchboard + Fisher
* updated (Oct. 24, 2020)
** updated (Nov. 13, 2021)
We also release multilingual pre-trained wav2vec 2.0 (XLSR) models: ...