CTranslate2 is a C++ and Python library for efficient inference with Transformer models. The project implements a custom runtime that applies many performance optimization techniques such as weights quantization, layers fusion, batch reordering, etc., to accelerate and reduce the memory usage of Transformer models on CPU and GPU.
Fast inference engine for Transformer models.
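Weight quantization can also be requested when a converted model is loaded. A minimal sketch, assuming a model already converted to the CTranslate2 format in an ende_ctranslate2/ directory (the path is illustrative); compute_type is a standard option of ctranslate2.Translator:

```python
import ctranslate2

# Request int8 computation at load time; CTranslate2 falls back to a
# supported compute type if the hardware cannot run int8.
translator = ctranslate2.Translator(
    "ende_ctranslate2/", device="cpu", compute_type="int8"
)
```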
v4.5.0 — Note: the CTranslate2 Python package now supports cuDNN 9 and is no longer compatible with cuDNN 8. New features: support Phi3 (#1800), support Mistral Nemo (#1785), support Wav2Vec2Bert ASR (#1778). Fixes and improvements: Upgrade...
Latest OpenNMT/CTranslate2 release: v4.2.1 (2024-04-24 18:04:01). New features: support Flash Attention (#1651), an implementation of GEMM for the FLOAT32 compute type with the RUY backend (#1598), Conv1D quantization on CPU only (the DNNL and CUDA backends are not supported) (#1601). Fixes and improvements: Fix ...
1. Converting a model with CTranslate2. First download CTranslate2 from the OpenNMT website and activate the opennmt environment, then run the converter: ct2-opennmt-py-converter --model_path /home//OpenNMT-py//general_domain_zh_fr/model/model_step_200000.pt --output_dir /home//OpenNMT-py//basic/basic_zh_fr_model/convert1028 ...
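The same conversion can also be scripted from Python, and the weights can be quantized at conversion time. A minimal sketch, assuming an illustrative checkpoint name and output directory; OpenNMTPyConverter and the quantization argument belong to the ctranslate2.converters API:

```python
from ctranslate2.converters import OpenNMTPyConverter

# Convert an OpenNMT-py checkpoint to the CTranslate2 format and store
# the weights in int8 (int16 and float16 are also accepted).
converter = OpenNMTPyConverter("model_step_200000.pt")
converter.convert("zh_fr_ct2_int8", quantization="int8")
```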
The max_size in the script also seems to shorten the actual built_batches, so in the test the max_size values are applied in decreasing order. Here it is:
Start using CTranslate2 from Python by converting a pretrained model and running your first translation.
1. Install the Python packages: pip install ctranslate2 OpenNMT-py==2.* sentencepiece
2. Download the English-German Transformer model trained with OpenNMT-py: wget https://s3.amazonaws.com/op...
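The remaining quickstart steps convert the downloaded model and translate a tokenized sentence from Python. A minimal sketch, assuming the model has been converted into an ende_ctranslate2/ directory and that sentencepiece.model is the tokenizer shipped with the downloaded archive (both names are illustrative):

```python
import ctranslate2
import sentencepiece as spm

# Load the converted model and the SentencePiece tokenizer.
translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu")
sp = spm.SentencePieceProcessor()
sp.load("sentencepiece.model")

# Tokenize, translate, and detokenize a single sentence.
input_tokens = sp.encode("Hello world!", out_type=str)
results = translator.translate_batch([input_tokens])
output_tokens = results[0].hypotheses[0]
print(sp.decode(output_tokens))
```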
Latest OpenNMT/CTranslate2 release: v4.3.0 (2024-05-17 16:20:20). New features: support conversion of GPT-NeoX models with the Transformers converter, extend the end_token argument to also accept a list of tokens, add an option return_end_token to include the end token in the results of the methods ...
Latest OpenNMT/CTranslate2 release: v4.2.1 (2024-04-24 18:04:01). New features: update the Transformers converter with new architectures (CodeGen, GPTBigCode, LLaMa, MPT), update the OpenNMT-py converter to support some recent options: layer_norm="rms", max_relative_positions=-1 (rotary embeddings), max...
docker run --rm ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 --help (newer image tags are built against CUDA 12). The Docker image supports GPU execution: install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/overview.html) ...
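With the NVIDIA Container Toolkit installed, GPUs are exposed to the container through Docker's standard --gpus flag; for example, reusing the same image tag shown above: docker run --rm --gpus all ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 --help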