NOTE: This repo is deprecated with this fix to the main vLLM repo.

# vllm-nccl

Manages the vllm-nccl dependency (Apache-2.0 license).

To cut a release:

1. Define `package_name`, `nccl_version`, and `vllm_nccl_verion`.
2. Run `python setup.py sdist`.
3. Run `twine upload dist/*`.
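The release flow above can be exercised end-to-end on a throwaway package (a sketch; the package name and version below are stand-ins, not the real ones):

```shell
set -e
mkdir -p demo_pkg && cd demo_pkg
cat > setup.py <<'EOF'
from setuptools import setup
# stand-in values for package_name / nccl_version / vllm_nccl_verion
setup(name="vllm_nccl_demo", version="2.18.1.1.0")
EOF
python setup.py sdist    # step 2: build a source distribution into dist/
ls dist/                 # the sdist tarball lands here
# twine upload dist/*    # step 3: publish (needs PyPI credentials), not run here
```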
Fragments from `setup.py` (truncated in the page capture):

```python
# ... .join([nccl_version, vllm_nccl_verion])
assert nccl_version == "2.18.1", f"only support nccl 2.18.1, got {version}"
url = f"https://storage.googleapis.com/vllm-public-assets/nccl/{cuda_name}/libnccl.so.{nccl_version}"
url = f"https://github.com/vllm-project/vllm-nccl/releases..."  # (truncated)

# ... environ["VLLM_INSTALL_NCCL"].split("+")
assert nccl_major_version in ["2.20", "2.18", "2.17", "2.16"], f"Unsupported nccl major version: {nccl_major_version}"
assert cuda_major_version in ["11", "12"], f"Unsupported cuda major version: {cuda_major_version}"
```
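Putting the truncated pieces together, a value like `2.18.1+cu12` for `VLLM_INSTALL_NCCL` would be parsed roughly as below. The exact input format is an assumption inferred from the `split("+")` and the major-version checks; the real setup.py may differ, and `parse_vllm_install_nccl` is a hypothetical helper name.

```python
def parse_vllm_install_nccl(value: str) -> tuple[str, str]:
    """Split a value like '2.18.1+cu12' into (nccl_major_version, cuda_major_version)."""
    nccl_version, cuda_tag = value.split("+")                   # '2.18.1', 'cu12'
    nccl_major_version = ".".join(nccl_version.split(".")[:2])  # '2.18'
    cuda_major_version = cuda_tag.removeprefix("cu")            # '12'
    # same supported-version checks as the setup.py snippet above
    assert nccl_major_version in ["2.20", "2.18", "2.17", "2.16"], \
        f"Unsupported nccl major version: {nccl_major_version}"
    assert cuda_major_version in ["11", "12"], \
        f"Unsupported cuda major version: {cuda_major_version}"
    return nccl_major_version, cuda_major_version
```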
From the vLLM Dockerfile (truncated):

```dockerfile
# The `vllm_nccl` package must be installed from a source distribution.
# pip is too smart: it stores a wheel in the cache, and other CI jobs
# will directly use the wheel from the cache, which is not what we want,
# so we need to remove it manually.
RUN --mount=type=cache,target=/root...
```
I'm trying to load a model with `LLM(model="meta-llama/Llama-2-7b-chat-hf")` and I'm getting the error below:

```
DistBackendError: NCCL error in: ../torch/csrc/distributed/c10d/NCCLUtils.hpp:219, invalid argument, NCCL version 2.14.3
ncclInvalid...
```
Your current environment: vllm 0.4.0.post1 docker image, started with:

```shell
docker run -d \
  --runtime=nvidia \
  --gpus '"device=0,1"' \
  --shm-size=10.24gb \
  -p 5002:5002 \
  -e NCCL_IGNORE_DISABLED_P2P=1 \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:...
```
```
NCCL version 2.20.5+cuda11.0
INFO 08-06 18:38:36 custom_all_reduce_utils.py:232] reading GPU P2P access cache from /home/wjc/.cache/vllm/gpu_p2p_access_cache_for_0,1.json
(VllmWorkerProcess pid=3615391) INFO 08-06 18:38:36 custom_all_reduce_utils.py:232] reading GPU P2P access...
```
The NCCL version in the environment is `nvidia-nccl-cu11 2.20.5`.

3. When deploying an LLM with vLLM, the following errors occur:

```
2024-06-18 22:03:51 | INFO | stdout | (RayWorkerWrapper pid=1043334) ERROR 06-18 22:03:50 worker_base.py:148]   File "/opt/anaconda3/envs/vllm4/...
```
Your current environment: vllm 0.5.4 (output of `python collect_env.py` not provided).

🐛 Describe the bug

Running inference on 8 × A800 GPUs with vLLM serving a 40B FP8 model at tp=8. Inference works at first, but after trying a few cases the NCCL communication fails:

```
INFO 08-22 0
```