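For the "vllm runtimeerror: failed to infer device type" error you ran into, here are a few possible troubleshooting steps: Confirm the full error message and its context: the error message usually contains further details that are critical for diagnosing the problem. Make sure you have read the complete error output and understand the context in which it occurred. If possible, posting the complete error output will help pinpoint the problem more accurately. Check the code...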
Perhaps you can try upgrading the vLLM version. It's possible that this issue has been fixed since v0.5.5. Yeah, I tried that with v0.6.2 (posted the results above). Getting "Failed to infer device type" exception. I will run the python script in the container and post here, as...
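Before upgrading or posting logs, it may help to record which vLLM and PyTorch versions are actually installed in the container and whether PyTorch can see any GPU at all. The snippet below is a minimal diagnostic sketch, not part of vLLM: the helper names `installed_version` and `probe_devices` are hypothetical, and the idea that a missing or invisible accelerator is behind "Failed to infer device type" is an assumption that the thread does not confirm.

```python
import importlib.util
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def probe_devices() -> dict:
    """Collect basic facts about what PyTorch can see (hypothetical helper)."""
    report = {
        "vllm_version": installed_version("vllm"),
        "torch_version": installed_version("torch"),
        "torch_cuda_build": None,   # CUDA version PyTorch was built against
        "torch_hip_build": None,    # ROCm/HIP version, if a ROCm build
        "cuda_available": False,
        "cuda_device_count": 0,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not importable in this environment

    import torch

    report["torch_cuda_build"] = getattr(torch.version, "cuda", None)
    report["torch_hip_build"] = getattr(torch.version, "hip", None)
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["cuda_device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in probe_devices().items():
        print(f"{key}: {value}")
```

Running this inside the same container that starts the server, and pasting its output next to the `collect_env.py` output, gives a quick view of whether the packages are installed and whether any device is visible from that environment.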
Your current environment The output of `python collect_env.py` python -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 12345 --max-model-len 65536 --trust-remote-code --tensor-parallel-size 8 --quantization moe_wna16 --gpu-mem...
Using mindie 1.0.0 to bring up the Qwen2-VL model inference service, running ./bin/mindieservice_daemon reports the error LLMInferEngine failed to init LLMInferModels. According to the mindie documentation, version 1.0.0 should support the Qwen2VL model. 2. Software versions: -- CANN version: 8.0.0 -- PyTorch version: 2.1.0 -- Python version: 3.11.6 -- mindie version: 1.0.0 3. Log info...
LLMInferEngine failed to init LLMInferModels ERR: Failed to init endpoint! Please check the service log or console output. Killed 2. Software versions: -- MindIE image version info: swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:1.0.0-800I-A2-py311-openeuler24.03-lts ...
Apps that use facial expressions or facial movements to infer emotional states, such as anger, disgust, happiness, sadness, surprise, fear, or other terms commonly used to describe the emotional state of a person can be restricted based on the review. Use of facial expressions...
vLLM Version: N/A
vLLM Build Flags: CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
      GPU0  GPU1  GPU2  GPU3  NIC0  NIC1  NIC2  NIC3  NIC4  NIC5  NIC6  NIC7  NIC8  CPU Affinity  NUMA Affinity  GPU NUMA ID
GPU0  X     NV12  NV12  NV12  SYS   SYS   SYS   SYS   NODE  NODE  SYS   SYS   NODE  ...
Your current environment Environment in the container: INFO 04-29 06:34:01 [__init__.py:239] Automatically detected platform cuda. Collecting environment information... PyTorch version: 2.6.0+cu124 Is debug build: False CUDA used to build Py...
The 0 stuck notify wait context info:(context_id=12, notify_id=12).[FUNC:ProcessStarsHcclFftsPlusTimeoutErrorInfo][FILE:device_error_proc.cc][LINE:1392] fftsplus task execute failed, dev_id=0, stream_id=3, task_id=23184, context_id=12, thread_id=0, err_type=13[hccl fftsplus time...