2024.04.22: Support for inference and fine-tuning of Llama3 GPTQ-Int4, GPTQ-Int8, and AWQ series models. Support for inference and fine-tuning of chatglm3-6b-128k, Openbuddy-Llama3. 2024.04.20: Support for inference, fine-tuning, and deployment of Atom series models. This includes: ...
2024.05.22: Support for the TeleChat-12B-v2 model and its quantized version; the model_type values are telechat-12b-v2 and telechat-12b-v2-gptq-int4. 🔥2024.05.21: Inference and fine-tuning support for MiniCPM-Llama3-V-2_5 is now available. For more details, please refer to the minicpm-v-2.5 Best Practice...
Modify the model path `MODEL_PATH` in `open_api_server.py` to choose whether to serve GLM-4-9B-Chat or GLM-4v-9B, then start the server.
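Once the server is running, a client talks to it with an OpenAI-style chat request. The sketch below only builds and serializes such a request body; the endpoint URL, port, and model name are assumptions for illustration, not values taken from the repository.

```python
import json

# Assumed endpoint; check open_api_server.py for the actual host/port.
BASE_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_payload(prompt: str, model: str = "glm-4-9b-chat") -> str:
    """Serialize a minimal OpenAI-compatible chat request body."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return json.dumps(payload)

if __name__ == "__main__":
    # This JSON string would be POSTed to BASE_URL by an HTTP client.
    print(build_chat_payload("Hello!"))
```

The payload shape follows the common OpenAI chat-completions convention; field names beyond `model`, `messages`, and `stream` may differ depending on how the server is configured.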