from datasets import load_dataset

dataset = load_dataset("squad")
for split, split_dataset in dataset.items():
    split_dataset.to_json(f"squad-{split}.jsonl")

For more information, look at the official Hugging Face script: https://colab.research.google.com/github/huggingface/notebooks/blob/maste...
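A minimal sketch of checking such an export: `to_json` writes JSON Lines by default (one JSON object per line), so a file can be read back with the standard library alone. The `read_jsonl` helper, the file name, and the rows below are made-up stand-ins for illustration, not real SQuAD data.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Hypothetical stand-in rows; a real file would come from split_dataset.to_json(...).
sample_rows = [
    {"id": "0", "question": "What is stored here?", "answers": {"text": ["an example"]}},
    {"id": "1", "question": "How many rows?", "answers": {"text": ["two"]}},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "squad-example.jsonl")
    # Write one JSON object per line, mimicking the JSON Lines layout.
    with open(sample_path, "w", encoding="utf-8") as handle:
        for row in sample_rows:
            handle.write(json.dumps(row) + "\n")
    loaded_rows = read_jsonl(sample_path)

print(len(loaded_rows), loaded_rows[0]["question"])
```

Comparing the number of records read back with the size of the original split is a cheap sanity check before the files are used elsewhere.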
Describe the bug: load_from_disk and save_to_disk are not compatible. When I use save_to_disk to save a dataset to disk, it works perfectly, but given the same directory, load_from_disk throws an error that it can't find state.json. looks li...
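One possible cause, offered as an assumption because the report above is cut off: `save_to_disk` on a `DatasetDict` writes a `dataset_dict.json` at the top level and places each split's `state.json` in its own subdirectory, so loading the top-level directory as a single `Dataset` would not find `state.json`. The helper below is hypothetical (it is not part of `datasets`) and only inspects which marker file is present; the directory layout it builds is a fake stand-in.

```python
import os
import tempfile


def describe_saved_dataset(path):
    """Guess what kind of save_to_disk output a directory holds (hypothetical helper)."""
    if os.path.isfile(os.path.join(path, "dataset_dict.json")):
        return "DatasetDict"
    if os.path.isfile(os.path.join(path, "state.json")):
        return "Dataset"
    return "unknown"


with tempfile.TemporaryDirectory() as root:
    # Fake layout: a DatasetDict-style directory with one "train" split inside it.
    dict_dir = os.path.join(root, "as_dict")
    split_dir = os.path.join(dict_dir, "train")
    os.makedirs(split_dir)
    open(os.path.join(dict_dir, "dataset_dict.json"), "w").close()
    open(os.path.join(split_dir, "state.json"), "w").close()

    top_level_kind = describe_saved_dataset(dict_dir)
    split_kind = describe_saved_dataset(split_dir)
    empty_kind = describe_saved_dataset(root)

print(top_level_kind, split_kind, empty_kind)
```

If the top level turns out to be a `DatasetDict`, the module-level `datasets.load_from_disk(path)`, which handles both kinds, or pointing the loader at a split subdirectory may be worth trying.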
and got satisfying results in inference, but when I try to use SFTTrainer.save_model and load the model from the saved files using LlamaForCausalLM.from_pretrained, the inference results seem to be those of the non-fine-tuned model
AutoModelForCausalLM will automatically attach adapters in your model if the model path contains an adapter_model.safetensors or adapter_config.json. Have you put one of these files by mistake in that folder? Hi @sd3ntato As per the PEFT integration in transformers: https://huggingface.co/docs/peft/t...
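The check suggested above can be sketched with the standard library: look in the model folder for the two adapter file names mentioned in the reply. The `find_adapter_files` helper and the temporary folder are hypothetical stand-ins, and this only reports which files exist; it does not load anything.

```python
import os
import tempfile

# File names taken from the reply above.
ADAPTER_FILES = ("adapter_model.safetensors", "adapter_config.json")


def find_adapter_files(model_dir):
    """Return the adapter-related file names that exist directly in model_dir."""
    return [
        name
        for name in ADAPTER_FILES
        if os.path.isfile(os.path.join(model_dir, name))
    ]


with tempfile.TemporaryDirectory() as model_dir:
    found_before = find_adapter_files(model_dir)

    # Simulate a stray adapter config left in the folder.
    open(os.path.join(model_dir, "adapter_config.json"), "w").close()
    found_after = find_adapter_files(model_dir)

print(found_before, found_after)
```

An empty result suggests the folder holds only base-model files, while a non-empty one means adapter files are present in that path.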
My own task or dataset (give details below)
Reproduction
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1", add_special_tokens=True)
tok.save_pretrained("out")
The snippet: works well on add_special_tokens= being present, absent, True/False on 4.33 an...
❓ Questions & Help I am training ALBERT from scratch, following the blog post by Hugging Face. It mentions that: If your dataset is very large, you can opt to load and tokenize examples on the fly, rather than as a preprocessing step...
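The idea of tokenizing on the fly can be sketched as a map-style dataset that stores raw text and tokenizes an example only when it is indexed. This is an illustration under stated assumptions, not the blog post's code: the class name `LazyTokenizedDataset` and the toy `whitespace_tokenize` function are hypothetical, and a real setup would pass in an actual tokenizer callable instead.

```python
class LazyTokenizedDataset:
    """Map-style dataset that tokenizes each example only when it is accessed."""

    def __init__(self, texts, tokenize_fn):
        self.texts = texts              # raw, untokenized examples
        self.tokenize_fn = tokenize_fn  # any callable: str -> features

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, index):
        # Tokenization happens here, per item, not in a preprocessing pass.
        return self.tokenize_fn(self.texts[index])


def whitespace_tokenize(text):
    """Toy stand-in for a real tokenizer: lowercase and split on whitespace."""
    return {"tokens": text.lower().split()}


corpus = ["Hello world", "Tokenize me on the fly"]
lazy_dataset = LazyTokenizedDataset(corpus, whitespace_tokenize)
first_item = lazy_dataset[0]

print(len(lazy_dataset), first_item)
```

The trade-off is that nothing tokenized is cached, so each epoch repeats the work in exchange for not holding the whole tokenized corpus in memory.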
My own task or dataset (give details below)
Reproduction
self.model, self.peft_optimizer, _, self.peft_lr_scheduler = deepspeed.initialize(
    config=training_args.deepspeed,
    model=model,
    model_parameters=optimizers['model_parameters'] if self.training_args.do_train else None,
    optimizer=hf_optimizer...
"dataset_filename_join_string": " ",
"training_image_repeats_per_epoch": 1,
"training_write_csv_every": 500,
"training_xattention_optimizations": false,
"training_enable_tensorboard": false,
"training_tensorboard_save_images": false,
"training_tensorboard_flush_every": 120,
"sd_model_check...