(1) Deploying DeepSeek inference on Huawei Ascend NPUs. Reference blog: 《华为昇腾推理DeepSeek-R1,性能比肩高端GPU,API免费无限量!潞晨自研推理引擎出手了》 (Huawei Ascend runs DeepSeek-R1 inference on par with high-end GPUs, with a free, unlimited API; Luchen's in-house inference engine steps in). The MindIE framework from the Huawei Ascend community has successfully adapted the BF16 version of DeepSeek-V3. For a step-by-step guide for Ascend NPUs, follow the instructions there. (2) Deploying DeepSeek inference with TRT-LLM. GitHub: https:/...
The repository referenced is grps-trtllm (Apache-2.0 license): GRPS + TensorRT-LLM, a pure C++ high-performance OpenAI-compatible LLM service supporting chat, AI agents, multi-modal models, and more. Its deepseek-r1 prompt style was updated to the latest revision on 2025/02/09.
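Because grps-trtllm exposes an OpenAI-compatible HTTP interface, a deployed DeepSeek-R1 distill model can be queried with any standard OpenAI client. The snippet below is only a minimal sketch: the base URL, port, and model name are placeholders I am assuming for illustration, not values taken from the repository; substitute the ones from your own deployment.

```python
# Minimal sketch: query a grps-trtllm deployment through its OpenAI-compatible API.
# The base_url, api_key, and model name below are placeholders -- replace them with
# the values from your own server and configuration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9997/v1",  # placeholder endpoint of the local service
    api_key="not-needed",                 # a local deployment typically ignores the key
)

resp = client.chat.completions.create(
    model="deepseek-r1-distill-qwen",     # placeholder model name
    messages=[{"role": "user", "content": "Briefly explain what TensorRT-LLM is."}],
    temperature=0.6,
)
print(resp.choices[0].message.content)
```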
An excerpt from the repository's model-support table (diff line numbers stripped; the two check-mark columns follow the headers of the full README table):

| DeepSeek-R1-Distill, TinyR1-32B-Preview | deepseek-r1 | ✅ | ❌ | [deepseek-r1-distill](docs%2Fdeepseek-r1-distill.md) |
| QwQ-32B | qwq | ✅ | ✅ | [qwq](docs%2Fqwq.md) |
The distilled-model configuration `conf/inference_deepseek-r1-distill-qwen.yml` sets `llm_style: deepseek-r1`; its tokenizer section sets `tokenizer_type: huggingface` (valid values are `huggingface` or `sentencepiece`, and the field must be set). A quick way to check these values is sketched below.
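As a sanity check after editing the config, the sketch below loads the YAML file and prints the two fields. It assumes PyYAML is installed and, since the exact nesting of the repo's YAML is not shown here, it searches the whole document for the keys instead of hard-coding a path.

```python
# Sketch: sanity-check llm_style / tokenizer_type in the inference config.
# Assumes PyYAML is available; the repo's actual YAML nesting may differ, so the
# two keys are located by a depth-first search rather than a fixed path.
import yaml

def find_key(node, key):
    """Depth-first search for `key` anywhere in a parsed YAML structure."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        for value in node.values():
            found = find_key(value, key)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_key(item, key)
            if found is not None:
                return found
    return None

with open("conf/inference_deepseek-r1-distill-qwen.yml") as f:
    conf = yaml.safe_load(f)

print("llm_style      =", find_key(conf, "llm_style"))       # expected: deepseek-r1
print("tokenizer_type =", find_key(conf, "tokenizer_type"))  # expected: huggingface
```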
A later revision of the same table extends the qwq entry from QwQ-32B to QwQ-32B, QwQ-32B-AWQ, still documented in [qwq](docs%2Fqwq.md).