cuiyongchao opened this issue Mar 19, 2024 · 1 comment
cuiyongchao commented Mar 19, 2024: CUDA_VISIBLE_DEVICES=0,1,2,3,4 python3 -m vllm.entrypoints.openai.api_server --served-model-name Qwen1.5-72B-Chat --model /data/models/Qwen1.5-72B-Chat --host 0.0.0.0 --port 8089 出...
Expected behavior: multiple GPU cards can be selected through the CUDA_VISIBLE_DEVICES=0,1,2,3,... parameter, and python3 -m qanything_kernel.qanything_server.sanic_api --host 0.0.0.0 --port 8777 --model_size 7B runs normally. Environment: - OS: Ubuntu 22.04.4 LTS - NVIDIA Driver: 550.54.14 - CUDA: 12.4 - docker: installed in a pure Python environment -dock...
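Suppose the machine has four cards: gpu0, gpu1, gpu2, gpu3. os.environ['CUDA_VISIBLE_DEVICES']='0,1,2' # This means that, as far as torch is concerned, only gpu0, gpu1, and gpu2 are visible; gpu3 is not. With os.environ['CUDA_VISIBLE_DEVICES']='1,2', torch sees only cards 1 and 2, and gpu1 is the primary card. Note that the "os.environ[...]" statement must...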
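The renumbering described above can be sketched in plain Python. CUDA re-indexes the visible devices from 0, so with `'1,2'` the physical gpu1 becomes logical device 0, which is why it acts as the primary card. The helper `visible_device_map` is a hypothetical name introduced here for illustration. It handles only comma-separated integer ids (no GPU UUIDs, no invalid entries) and does not query any real GPU. The variable is read when CUDA is initialized, so this sketch sets it before any CUDA-using library would be imported.

```python
import os


def visible_device_map(value):
    """Return {logical index: physical GPU id} for a CUDA_VISIBLE_DEVICES string.

    Simplified illustration (hypothetical helper): handles only
    comma-separated integer ids, not GPU UUIDs or invalid entries.
    """
    ids = [part.strip() for part in value.split(",") if part.strip()]
    return {logical: int(physical) for logical, physical in enumerate(ids)}


# Set the variable before any CUDA-using library initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2"

mapping = visible_device_map(os.environ["CUDA_VISIBLE_DEVICES"])
for logical, physical in mapping.items():
    # The framework addresses devices by logical index, starting at 0.
    print(f"cuda:{logical} -> physical gpu{physical}")
```

For `'1,2'` this prints `cuda:0 -> physical gpu1` and `cuda:1 -> physical gpu2`. For `'0,1,2'` the logical and physical indices coincide.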