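global cdb
configure(deepspeed_config=config)
# 2. Check the state of cdb: if it is None or not yet initialized, tell the rest of the program that the communication backend must be initialized (dist_init_required is True)
if dist_init_required is None:
    dist_init_required = cdb is None or not cdb.is_initialized()
# 3. If cdb is None, set up the communication backend (NCCL/gloo/MPI, etc.)
if cdb is None:
    init_deepspeed_backend...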
From the deep learning bookmark collection created by 大棒居-杨大棒: a source-code walkthrough of how DeepSpeed and Megatron call NCCL, covering communication backend initialization in init_distributed().
deepspeed_train.py pipeline_parallelism train.py
BingBertSquad/nvidia_run_squad_deepspeed.py: 2 changes (1 addition, 1 deletion)
@@ -741,7 +741,7 @@ def set_optimizer_params_grad(named_params_optimizer, ...