To enable these techniques, set use_dynamic_ntk and use_logn_attn to true in config.json. The latest code sets both to true by default.

Finetuning

Is SFT or even RLHF currently supported? We currently provide SFT code supporting full-parameter finetuning, LoRA, and Q-LoRA. In addition, several external projects already support finetuning Qwen, such as FastChat, Firefly, and LLaMA Efficient Tuning. We will update this code and its documentation as soon as possible...
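The two flags above live in the checkpoint's config.json. As a minimal sketch of flipping them programmatically (the helper name and the idea of editing the file in place are illustrative assumptions, not part of the Qwen codebase):

```python
import json
from pathlib import Path

def enable_long_context(config_path: str) -> dict:
    """Set use_dynamic_ntk and use_logn_attn to true in a config.json file.

    Hypothetical helper: reads the JSON config, flips the two long-context
    flags, and writes the file back in place.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["use_dynamic_ntk"] = True  # NTK-aware scaled RoPE for long inputs
    config["use_logn_attn"] = True    # log-n attention scaling
    path.write_text(json.dumps(config, indent=2, ensure_ascii=False))
    return config
```

The function returns the updated config dict, so a caller can verify both flags before reloading the model.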
[Perplexity table fragment; surviving rows: "+ dynamic_ntk + logn + window_attn" (3.46, 3.29, 3.18, 3.42) and "Qwen-72B" (2.83, 2.73, 2.72); column headers lost in extraction.]

Furthermore, to verify the ability of Qwen-72B-Chat on long text understanding, we tested it on L-Eval (closed-ended tasks). The results are as follows:

[L-Eval results table; columns: Model, Input Length, Average, Coursera, GSM, QuALIT...; remaining data truncated.]
use_dynamic_ntk and use_logn_attn in config.json should be set to true (true by default).

Finetuning

Can Qwen support SFT or even RLHF? Yes, we now support SFT, including full-parameter finetuning, LoRA, and Q-LoRA. You can also check other projects like FastChat, Firefly, LLaMA ...
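To illustrate why LoRA makes finetuning cheap: the base weight W stays frozen and only a low-rank update is learned. A minimal numpy sketch of the reparameterization (the shapes, alpha, and r here are illustrative defaults, not Qwen's actual configuration):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=32, r=8):
    """y = x @ (W + (alpha / r) * A @ B).

    W (d_in x d_out) is frozen; only the low-rank factors
    A (d_in x r) and B (r x d_out) receive gradient updates.
    """
    scale = alpha / r
    return x @ W + scale * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 16, 8
W = rng.normal(size=(d_in, d_out))        # frozen pretrained weight
A = rng.normal(size=(d_in, r)) * 0.01     # small random init
B = np.zeros((r, d_out))                  # zero init: update starts as a no-op
x = rng.normal(size=(2, d_in))

# With B at zero, the adapted layer reproduces the base layer exactly.
assert np.allclose(lora_forward(x, W, A, B, r=r), x @ W)
```

Because only A and B train, the number of trainable parameters is r * (d_in + d_out) per adapted matrix rather than d_in * d_out, which is what allows LoRA and Q-LoRA to finetune large models on modest hardware.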