# You can set these packing optimizations AFTER starting a training run at least once.
# The trainer will provide recommended values for these fields.
eval_sample_packing:
sample_packing_eff_est:
total_num_tokens:

# Set to 'lora' or 'qlora', or leave blank to train all parameters...
Training script with DeepSpeed ZeRO-3: finetune.sh. If you do not have enough GPU memory, use LoRA: finetune_lora.sh. We are able to fit 13B training on 8x A100-40G or 8x A6000, and 7B training on 8x RTX 3090. Make sure per_device_train_batch_size*gradient_accumulation_steps is the...
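When shrinking per_device_train_batch_size to fit memory, the product above is what must stay fixed: the global batch size per optimizer step is the per-device micro-batch times the accumulation steps times the number of GPUs. A minimal sketch of that arithmetic (the function name and the sample numbers are illustrative assumptions, not part of the training scripts):

```python
def effective_batch_size(per_device_train_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_gpus: int) -> int:
    """Global batch size seen by the optimizer per update step."""
    return per_device_train_batch_size * gradient_accumulation_steps * num_gpus

# e.g. micro-batch 4 with 4 accumulation steps on 8 GPUs
# gives the same global batch as micro-batch 16 with 1 step on 8 GPUs
print(effective_batch_size(4, 4, 8))   # 128
print(effective_batch_size(16, 1, 8))  # 128
```

So if you halve per_device_train_batch_size to avoid OOM, double gradient_accumulation_steps to keep training dynamics unchanged.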