By default, our configuration uses a LoRA fine-tuned 8B teacher model, a downloaded 1B student model, a learning rate of 3e-4, and a KD loss ratio of 0.5. For this case study, on alpaca_cleaned_dataset (https://pytorch.org/torchtune/main/generated/torchtune.datasets.alpaca_cleaned_dataset.html#torchtune.datasets.alpaca_cleaned_dataset), we perform...
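To make the role of a KD loss ratio such as 0.5 concrete, here is a minimal sketch in plain Python. It assumes the ratio linearly mixes a cross-entropy term on the ground-truth label with a forward KL term between teacher and student distributions; the snippet above does not spell out this formula, so treat the weighting and every function name here (`log_softmax`, `forward_kl`, `distillation_loss`) as illustrative assumptions rather than the library's implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_entropy(student_logits, label):
    """Negative log-likelihood of the ground-truth label under the student."""
    return -log_softmax(student_logits)[label]


def forward_kl(teacher_logits, student_logits):
    """KL(teacher || student) for a single token position."""
    t_logp = log_softmax(teacher_logits)
    s_logp = log_softmax(student_logits)
    return sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))


def distillation_loss(student_logits, teacher_logits, label, kd_ratio=0.5):
    """Assumed mix: (1 - kd_ratio) * class loss + kd_ratio * KD loss."""
    class_loss = cross_entropy(student_logits, label)
    kd_loss = forward_kl(teacher_logits, student_logits)
    return (1 - kd_ratio) * class_loss + kd_ratio * kd_loss


# Hypothetical logits over a 3-token vocabulary.
student = [1.0, 0.5, -0.5]
teacher = [2.0, 0.0, -1.0]
loss = distillation_loss(student, teacher, label=0, kd_ratio=0.5)
```

Under this assumption, a ratio of 0 reduces to ordinary supervised fine-tuning of the student, and a ratio of 1 trains the student on the teacher's distribution alone.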
The dataset is alpaca; a seed can be set, and batch_size=2. https://github.com/pytorch/torchtune/blob/release/0.5.0/torchtune/datasets/_alpaca.py https://huggingface.co/datasets/yahma/alpaca-cleaned # Dataset and Sampler dataset: _component_: torchtune.datasets.alpaca_cleaned_dataset packed: False # True ...
datasets._alpaca import alpaca_cleaned_dataset, alpaca_dataset from torchtune.datasets._chat import chat_dataset, ChatDataset from torchtune.datasets._cnn_dailymail import cnn_dailymail_articles_dataset @@ -22,8 +23,6 @@ TextCompletionDataset, ) from torchtune.datasets._wikitext import wikitext...
Split alpaca_dataset to alpaca + alpaca_cleaned (pytorch#639) Apr 3, 2024 tests Add string to InstructTemplate, ChatFormat getters (pytorch#641) Apr 4, 2024 torchtune Add string to InstructTemplate, ChatFormat getters (pytorch#641) Apr 4, 2024 .flake8 Add LoRA fused linear layer (pytorch...
False # Dataset and Sampler dataset: _component_: torchtune.datasets.alpaca_cleaned_dataset ...
datasets/ contains training datasets. alpaca_data_cleaned.json contains the text that is fed to the model to update its parameters. The dataset is licensed under datasets/LICENSE, while the remaining code in this repository falls under ./LICENSE.
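Fine-tuning large language models (LLMs) is essential for adapting pretrained models to specific tasks, but the process can be complex and resource-intensive. Torch...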
"dataset:\n", " _component_: torchtune.datasets.alpaca_dataset\n", " train_on_input: True\n", "seed: null\n", "shuffle: True\n", "\n", "# Model Arguments\n", "model:\n", " _component_: torchtune.models.mistral.lora_mistral_7b\n", " lora_attn_modules: ['q_proj', 'k...
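The `train_on_input` flag in the dataset config above controls whether prompt tokens contribute to the training loss. The following is a rough sketch of that idea in plain Python, not torchtune's code: `build_labels` is a hypothetical helper, and the ignore value of -100 is assumed because it is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
# Assumed ignore value: the default ignore_index of PyTorch's CrossEntropyLoss.
IGNORE_INDEX = -100


def build_labels(prompt_tokens, response_tokens, train_on_input):
    """Return (tokens, labels) for one instruction-tuning sample.

    With train_on_input=True every token is a training target.
    With train_on_input=False the prompt positions are masked, so
    only the response tokens contribute to the loss.
    """
    tokens = list(prompt_tokens) + list(response_tokens)
    if train_on_input:
        labels = list(tokens)
    else:
        labels = [IGNORE_INDEX] * len(prompt_tokens) + list(response_tokens)
    return tokens, labels


# Hypothetical token ids for a prompt and its response.
tokens, labels = build_labels([11, 12, 13], [21, 22], train_on_input=False)
```

Masking the prompt keeps the model from being rewarded for reproducing the instruction text, at the cost of fewer supervised tokens per sample.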