It does start, but HF inference is inefficient. Launching with `bash ./run.sh -c local -i 0 -b vllm -m Qwen-7B-QAnything -t qwen-7b-qanything` runs out of VRAM on a single 24 GB card, because vLLM does not support 8-bit quantization. I would therefore like to convert the model to 4-bit manually, but the currently released version cannot be converted.

AprildreamMI commented Feb 20, 2024
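To see why 24 GB is tight, here is a back-of-envelope calculation for the weight storage alone at different precisions. This is illustrative arithmetic only: the KV cache, activations, and CUDA overhead come on top of these numbers, and vLLM additionally pre-allocates a `gpu_memory_utilization` fraction of total VRAM up front.

```python
# Approximate VRAM needed just for the weights of a 7B-parameter model.
# Real usage is higher: KV cache, activations, and CUDA context are not counted.

def weight_vram_gb(n_params: float, bits: int) -> float:
    """Weight storage in GiB for n_params parameters stored at `bits` precision."""
    return n_params * bits / 8 / 1024**3

N = 7e9  # Qwen-7B
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_vram_gb(N, bits):.1f} GiB")
# → 16-bit weights: ~13.0 GiB
# →  8-bit weights: ~6.5 GiB
# →  4-bit weights: ~3.3 GiB
```

So fp16 weights alone already consume more than half of a 24 GB card before any KV cache is allocated, while a 4-bit checkpoint would leave ample headroom — which is why the conversion matters here.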
```
qanything-container-local | model_name is set to [Qwen-7B-QAnything]
qanything-container-local | conv_template is set to [qwen-7b-qanything]
qanything-container-local | tensor_parallel is set to [1]
qanything-container-local | gpu_memory_utilization is set to [0.81]
qanything-container-...
```
In this project, NetEase open-sourced their own fine-tuned Qwen model, and the service is launched through FastChat. For details, see the FastChat documentation on its OpenAI-compatible API. In `run_for_local_option.sh` (line 186) there is the following startup command:

```
mkdir -p /workspace/qanything_local/logs/debug_logs/fastchat_logs && cd /workspace/qanything_local/logs/debug_logs/fastchat_logs
noh...
```
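The truncated command presumably continues with the usual FastChat launch sequence. As a rough sketch only — the module entry points below are FastChat's real ones, but the ports, paths, log redirection, and exact worker flags are assumptions, not the actual contents of `run_for_local_option.sh`:

```shell
# Sketch of a typical three-process FastChat deployment (values are assumptions).

# 1. Controller: tracks registered model workers.
nohup python3 -m fastchat.serve.controller \
    --host 0.0.0.0 --port 21001 > controller.log 2>&1 &

# 2. vLLM-backed worker serving the NetEase checkpoint; --gpu-memory-utilization
#    is the vLLM engine argument matching the [0.81] seen in the container log.
nohup python3 -m fastchat.serve.vllm_worker \
    --model-path /model_repos/Qwen-7B-QAnything \
    --model-names Qwen-7B-QAnything \
    --controller-address http://0.0.0.0:21001 \
    --gpu-memory-utilization 0.81 > worker.log 2>&1 &

# 3. OpenAI-compatible REST API on top of the controller.
nohup python3 -m fastchat.serve.openai_api_server \
    --host 0.0.0.0 --port 8000 > api.log 2>&1 &
```

Once these are up, any OpenAI-style client can be pointed at port 8000 with the model name `Qwen-7B-QAnything`.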