per_device_train_batch_size

2025-01-07 15:09:33

When I set `per_device_train_batch_size=2`, the S2-Attn would...

However, when I set `per_device_train_batch_size=2` and run the following command: `CUDA_VISIBLE_DEVICES=1 torchrun --nproc_per_node=1 --master_port=29501 supervised-fine-tune.py --model_name_or_path /mnt/42_store/lhj/data/mllm/model_weights/Llama-2-7b-chat-hf --bf16 True` ...