Fast inference engine for Transformer models.
OpenNMT/CTranslate2 latest release: v4.3.0 (2024-05-17 16:20:20). New features: Support conversion of GPT-NeoX models with the Transformers converter; extend the end_token argument to also accept a list of tokens; add option return_end_token to include the end token in the results of the methods ...
OpenNMT/CTranslate2 latest release: v4.2.1 (2024-04-24 18:04:01). New features: Update the Transformers converter with new architectures: CodeGen, GPTBigCode, LLaMa, MPT. Update the OpenNMT-py converter to support some recent options: layer_norm="rms", max_relative_positions=-1 (rotary embeddings), max...
1. Convert the model with CTranslate2. First download CTranslate2 from the OpenNMT website and activate the opennmt environment: ct2-opennmt-py-converter --model_path /home//OpenNMT-py//general_domain_zh_fr/model/model_step_200000.pt --output_dir /home//OpenNMT-py//basic/basic_zh_fr_model/convert1028 After conversion, a convert folder is generated; this...
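The conversion step above can also be driven from a script. The following is a minimal sketch, assuming the ct2-opennmt-py-converter executable is on PATH once the environment is activated. The helper names build_convert_command and convert are hypothetical. Only the --model_path and --output_dir flags are taken from the command shown above.

```python
import shutil
import subprocess


def build_convert_command(model_path, output_dir):
    """Return the argv list for the converter CLI (hypothetical helper)."""
    return [
        "ct2-opennmt-py-converter",
        "--model_path", model_path,
        "--output_dir", output_dir,
    ]


def convert(model_path, output_dir):
    """Run the converter if it is installed; return True when it was run."""
    cmd = build_convert_command(model_path, output_dir)
    if shutil.which(cmd[0]) is None:
        # Converter not found on PATH, e.g. the environment is not activated.
        return False
    # check=True raises CalledProcessError if the converter exits with an error.
    subprocess.run(cmd, check=True)
    return True
```

Calling convert() with the checkpoint path and the target directory would run the same command as above; passing the arguments as a list avoids shell quoting problems with unusual paths.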
Note: The CTranslate2 Python package now supports cuDNN 9 and is no longer compatible with cuDNN 8. New features: Support Phi3 (#1800); support Mistral Nemo (#1785); support Wav2Vec2Bert ASR (#1778). Fixes and improvements: Upgrade to cuDNN 9 (#1803); fix logits vocab (#1786 + #1791) ...
The max_size in the script also appears to shorten the actual built_batches, so in the tests max_size is ordered from largest to smallest. Here it is:
OpenNMT/CTranslate2 latest release: v4.2.1 (2024-04-24 18:04:01). New features: Support T5 models, including the variants T5v1.1 and mT5. Support loading the model files from memory: Python: see the files argument in the constructor of classes loading models; C++: see the models::ModelMemoryReader...
OpenNMT/CTranslate2 latest release: v4.1.1 (2024-03-12 16:59:56). This major version introduces a breaking change by updating to CUDA 12. Breaking changes: Python: support CUDA 12. New features: Add to_device() to the StorageView class in Python to move data between host <-> device. Fixes ...
@ymoslem, I haven't compiled ctranslate2 to test it yet, and I'm waiting for the release. However, there seems to be an issue with Gemma-it compared to Mistral, and it gets worse with quantization. You can try using a repetition penalty, but overall, I've observed this problem...
The CTranslate2 version loading the model should not be older than the version that converted the model. For example, a new model is converted with CTranslate2 3.17.0, but the production server is still using an older version, 3.15.0. There is no guarantee that this new model can be ...
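The version rule above can be checked before deployment. The following is a minimal sketch, assuming you record the converter version yourself at conversion time and that versions are plain dotted integers such as "3.17.0". The function names parse_version and meets_version_rule are hypothetical and not part of CTranslate2.

```python
def parse_version(version):
    """Turn a dotted version string such as "3.17.0" into (3, 17, 0)."""
    return tuple(int(part) for part in version.split("."))


def meets_version_rule(runtime_version, converter_version):
    """True when the loading version is not older than the converting version."""
    # Tuples compare element by element, so (3, 9, 0) < (3, 17, 0) as intended.
    return parse_version(runtime_version) >= parse_version(converter_version)


# The example from the text: model converted with 3.17.0, server runs 3.15.0.
print(meets_version_rule("3.15.0", "3.17.0"))  # False
print(meets_version_rule("3.17.0", "3.15.0"))  # True
```

On the server, the runtime version could be read from ctranslate2.__version__. A False result means only that compatibility is not guaranteed, not that loading will necessarily fail.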