While fine-tuning the image_classification_timm_peft_lora model, the training step fails with KeyError: 'per_gpu_train_batch_size', even though the args include per_device_train_batch_size=batch_size and per_device_eval_batch_size=batch_size, which look correct. Environment (Mandatory) -- MindSpore version : 2.3....
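The KeyError suggests some code path still reads the legacy `per_gpu_train_batch_size` key while the arguments only define the newer `per_device_train_batch_size`. A minimal sketch of a backward-compatible lookup (the helper name and the plain `args` dict are hypothetical illustrations, not MindNLP or Transformers API):

```python
# Hypothetical fallback lookup: prefer the current key name, fall back to
# the legacy one, then to a default. Mirrors the per_device_* vs per_gpu_*
# rename that can trigger the KeyError above.
def get_train_batch_size(args: dict, default: int = 8) -> int:
    for key in ("per_device_train_batch_size", "per_gpu_train_batch_size"):
        if key in args:
            return args[key]
    return default

args = {"per_device_train_batch_size": 32, "per_device_eval_batch_size": 32}
print(get_train_batch_size(args))  # → 32
```

A lookup like this avoids the hard KeyError regardless of which key name the surrounding framework version populates.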
'train_micro_batch_size_per_gpu': 'auto', 'wall_clock_breakdown': False} If applicable, add screenshots to help explain your problem.
System info (please complete the following information):
OS: Ray image rayproject/ray:nightly-py39-cu118
GPU count and types: 3 Ray 2.5.1 workers each wi...
train_batch_size is not equal to micro_batch_per_gpu * gradient_acc_step * world_size: 256 != 4 * 8 * 1
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 91809) of binary: /home/ubuntu/anaconda3/envs/chat/bin/python
when I run ...
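DeepSpeed requires the global batch size to factor exactly into the per-GPU micro-batch, the gradient-accumulation steps, and the world size. A quick sanity check in plain Python (not DeepSpeed code) reproduces the failing numbers from the error above:

```python
# DeepSpeed's batch-size consistency rule:
#   train_batch_size == micro_batch_per_gpu * gradient_acc_steps * world_size
def batch_sizes_consistent(train_batch_size: int, micro_batch_per_gpu: int,
                           gradient_acc_steps: int, world_size: int) -> bool:
    return train_batch_size == micro_batch_per_gpu * gradient_acc_steps * world_size

# The failing configuration: 256 != 4 * 8 * 1 (= 32)
print(batch_sizes_consistent(256, 4, 8, 1))   # → False

# One consistent fix: with micro_batch 4 and world_size 1, raising
# gradient_accumulation_steps to 256 // (4 * 1) = 64 satisfies the rule.
print(batch_sizes_consistent(256, 4, 64, 1))  # → True
```

Alternatively, leaving `train_batch_size` as `'auto'` (as `train_micro_batch_size_per_gpu` is in the config fragment above) lets DeepSpeed derive the consistent value itself.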