### Your current environment

The output of `python collect_env.py`:

```
(64-bit runtime) Python platform: Linux-4.15.0-39-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU
```

### 🐛 Describe the bug

Starting vLLM with `export VLLM_USE_MODELSCOPE=True` produces errors:

```
INFO 07-24 08:44:25 model_runner.py:680] Starting to load model LLM-Research/Meta-Llama-3...
```
Steps to verify the fix:

1. Apply the patch.
2. Run `VLLM_USE_MODELSCOPE=true vllm serve Qwen/Qwen2.5-0.5B-Instruct`.
3. The log shows: `Downloading Model to directory: /root/.cache/modelscope/hub/Qwen/Qwen2.5-0.5B-Instruct`

**r4ntix** (Contributor, Author) commented on Feb 17, 2025:

> I tested it by patching this PR, and it works. Thanks. ...
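The behavior above hinges on how the `VLLM_USE_MODELSCOPE` environment variable is parsed (note the report sets `True` while the reproduction uses `true`). As a rough illustrative sketch of how such a truthy flag is commonly read in Python — not vLLM's actual implementation, and `use_modelscope` is a hypothetical helper name — the toggle can be made case-insensitive:

```python
import os

def use_modelscope() -> bool:
    """Hypothetical helper: parse a truthy environment flag.

    Accepts "true"/"True"/"1" so that both spellings seen in this
    issue behave the same. This is an illustrative sketch, not
    vLLM's actual source code.
    """
    return os.environ.get("VLLM_USE_MODELSCOPE", "False").lower() in ("true", "1")

# Both capitalizations enable ModelScope downloads under this parsing.
os.environ["VLLM_USE_MODELSCOPE"] = "True"
print(use_modelscope())  # True
```

With parsing like this, `export VLLM_USE_MODELSCOPE=True` and the lowercase `true` used in the reproduction command are equivalent.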
- 🎁 2025.02.16: Support LMDeploy in GRPO; use `--use_lmdeploy true`. Please check this script.
- 🔥 2025.02.12: Support for the GRPO (Group Relative Policy Optimization) algorithm for LLMs and MLLMs; documentation can be found here.
- 🎁 2025.02.10: SWIFT supports the fine-tuning of embedding models...