```python
from datasets import load_dataset

dataset = load_dataset("squad")
for split, split_dataset in dataset.items():
    split_dataset.to_json(f"squad-{split}.jsonl")
```

For more information, look at the official Hugging Face script: https://colab.research.google.com/github/huggingface/notebooks/blob/maste...
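Each file that `to_json` writes is in JSON Lines format: one JSON object per line. A minimal stdlib-only sketch for reading such a file back (the file name `squad-train.jsonl` is just the one the loop above would produce):

```python
import json

def read_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines defensively
                records.append(json.loads(line))
    return records

# Usage (assuming the export above has run):
# records = read_jsonl("squad-train.jsonl")
```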
```python
model = AutoModelForCausalLMWithValueHead.from_pretrained("./model_after_rl_comb_reward")
ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer, dataset=dataset, data_collator=collator)
```

Then test: same performance as with no RL (bad). Second condition: back up the model and redefine the...
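Backing up the saved model directory before redefining the trainer can be sketched with the stdlib alone; this is a hypothetical helper (the directory names are illustrative, not the exact paths from the run above):

```python
import shutil
from pathlib import Path

def backup_model_dir(src, backup_root="backups"):
    """Copy a saved model directory into backup_root before it gets
    overwritten by a new training run. Refuses to clobber an existing backup."""
    src = Path(src)
    dst = Path(backup_root) / src.name
    if dst.exists():
        raise FileExistsError(f"backup already exists: {dst}")
    shutil.copytree(src, dst)  # creates backup_root (and parents) as needed
    return dst

# Usage:
# backup_model_dir("./model_after_rl_comb_reward")
```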
```
    (cfg=parsed_cfg, cli_args=parsed_cli_args, dataset_meta=dataset_meta)
  File "/home/owen/axolotl/src/axolotl/train.py", line 142, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
  File "/home/owen/miniconda3/envs/axolotl/lib/python3.10/site-packages/transformers/trainer.py...
```
As per the PEFT integration in transformers (https://huggingface.co/docs/peft/tutorial/peft_integrations#transformers), `AutoModelForCausalLM` will automatically attach adapters to your model if the model path contains an `adapter_model.safetensors` or `adapter_config.json`. Have you put one of these files by mista...
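You can mirror that check locally before loading. A minimal stdlib-only sketch (the two file names are the ones from the PEFT docs above; the helper itself is hypothetical) that reports whether a model directory would be treated as an adapter checkpoint:

```python
import os

# The files whose presence makes AutoModelForCausalLM attach a PEFT adapter.
ADAPTER_FILES = ("adapter_model.safetensors", "adapter_config.json")

def looks_like_adapter_dir(model_path):
    """True if the directory contains either adapter file."""
    return any(
        os.path.isfile(os.path.join(model_path, name))
        for name in ADAPTER_FILES
    )

# Usage:
# if looks_like_adapter_dir("./model_after_rl_comb_reward"):
#     print("adapters will be attached automatically on load")
```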
My own task or dataset (give details below)

Reproduction

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1", add_special_tokens=True)
tok.save_pretrained("out")
```

The snippet works well with `add_special_tokens=` being present, absent, or True/False on 4.33 and ...
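If a stray `add_special_tokens` entry does end up in the saved `tokenizer_config.json`, one stdlib-only workaround sketch (hypothetical, not a fix shipped in transformers) is to drop the key from the saved config before reloading:

```python
import json

def strip_key_from_config(path, key="add_special_tokens"):
    """Remove one top-level key from a JSON config file in place.
    Returns True if the key was present and removed, False otherwise."""
    with open(path, encoding="utf-8") as f:
        cfg = json.load(f)
    if key not in cfg:
        return False
    del cfg[key]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2)
    return True

# Usage:
# strip_key_from_config("out/tokenizer_config.json")
```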
My own task or dataset (give details below)

Reproduction

```python
self.model, self.peft_optimizer, _, self.peft_lr_scheduler = deepspeed.initialize(
    config=training_args.deepspeed,
    model=model,
    model_parameters=optimizers['model_parameters'] if self.training_args.do_train else None,
    optimizer=hf_optimizer...
```
Purpose: enable saving and loading transformers models in 4-bit formats. Enables this PR in transformers: huggingface/transformers#26037. Addresses feature request #603 and other similar ones elsewhe...
- Support saving models trained with DeepSpeed in Trainer callbacks: huggingface/transformers#31338 (Open)
- ferrazzipietro mentioned this issue Jun 18, 2024: [BUG] AttributeError: 'Accelerator' object has no attribute 'deepspeed_config' microsoft/DeepSpeed#4143 ...