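When the NPUs on a physical machine have not been assigned IPs, /etc/hccn.conf is empty, so hccl_tools.py fails when it reads the file; the NPUs need to be assigned IPs first. What solution would you like to see? Add a note to the hccl_tools.py README, or have hccl_tools.py assign IPs to the machine automatically. What alternatives have you considered? Do you have any other context or screenshots? Intention to contribute: I intend to take part in the specific...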
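The failure described above can be caught before hccl_tools.py is run. The following is a minimal sketch, not code from hccl_tools.py: it assumes that /etc/hccn.conf stores one device IP per line in the form `address_<device_id>=<ip>`, and the helper names `parse_device_ips` and `check_hccn_conf` are hypothetical.

```python
from pathlib import Path


def parse_device_ips(conf_text):
    """Collect {device_id: ip} from lines shaped like 'address_0=192.168.100.101'.

    The 'address_<id>=<ip>' layout is an assumption about /etc/hccn.conf.
    """
    ips = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("address_") and "=" in line:
            key, value = line.split("=", 1)
            device_id = key.split("_", 1)[1]
            ips[device_id] = value.strip()
    return ips


def check_hccn_conf(path="/etc/hccn.conf"):
    """Raise a readable error when the file is missing, empty, or has no device IPs."""
    conf = Path(path)
    text = conf.read_text() if conf.exists() else ""
    ips = parse_device_ips(text)
    if not ips:
        raise RuntimeError(
            f"{path} contains no device IP entries; assign IPs to the NPUs first"
        )
    return ips
```

Calling `check_hccn_conf()` ahead of the rank-table generation turns an empty configuration file into an explicit message rather than a failure deeper in the script.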
The hccl_tools.py script fails to generate the hccl_8p.json file DONE #I3DP2Y Bug-Report xixi_han Created 2021-03-26 20:33 name about labels Bug Report Use this template for reporting a bug kind/bug Environment Hardware Environment(Ascend/GPU/CPU): Uncomment only one /device <> line, hit enter to put that in...
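Purpose: I plan to deploy chatglm3 on these two inference cards. There is no complete tutorial, so I am working it out on my own. After installing the base components, I tried to generate the hccl json file by running (ascend_py39) [root@xctest1 mindformers]# python ./mindformers/tools/hccl_tools.py --device_num "[0,8)" --server_ip=10.23.13.83 start /root/llm/mind/mindformers/./mi...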
The hccl header directory structure changed, which breaks the NPU-side build of mmcv; add the missing header (1.x branch). Modification: add the headers required for compilation in setup.py. BC-breaking (Optional) Does the modification introduce changes that break the backward-compatibility of the downstream repositories? If so, please describe how it breaks the compatibil...
The hccl header directory structure changed, which breaks the NPU-side build of mmcv; add the missing header. Modification: add the headers required for compilation in setup.py. BC-breaking (Optional) Does the modification introduce changes that break the backward-compatibility of the downstream repositories? If so, please describe how it breaks the compatibility and how...
python hccl_tools.py --device_num "[0,8)" output: hccl_[device_num]p_[which device]_[server_ip].json Note: the Ascend accelerators used must be contiguous. For example, [0,4) means to use the four chips 0,1,2,3; [0,1) means to use chip 0. The first four chips are...
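The `--device_num` value is a half-open interval: the left bound is included and the right bound is excluded. As an illustration only (this is not the parsing code used by hccl_tools.py, and `parse_device_range` is a hypothetical helper), the expansion into chip IDs could look like this:

```python
def parse_device_range(spec):
    """Expand a half-open range string such as "[0,4)" into a list of chip IDs."""
    spec = spec.strip()
    if not (spec.startswith("[") and spec.endswith(")")):
        raise ValueError(f"expected a range like '[0,8)', got {spec!r}")
    parts = spec[1:-1].split(",")
    if len(parts) != 2:
        raise ValueError(f"expected exactly two bounds in {spec!r}")
    first, last = int(parts[0]), int(parts[1])
    if not 0 <= first < last:
        raise ValueError(f"bounds must satisfy 0 <= first < last, got {spec!r}")
    # range() is half-open as well, so the right bound is excluded.
    return list(range(first, last))


print(parse_device_range("[0,4)"))  # [0, 1, 2, 3]
print(parse_device_range("[0,1)"))  # [0]
```

Because the result always comes from a single `range`, the chip IDs it yields are contiguous by construction, matching the note above.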