This is the first Chinese chat model specifically fine-tuned for Chinese through ORPO based on the Meta-Llama-3-8B-Instruct model. - Shenzhi-Wang/Llama3-Chinese-Chat
shenzhi-wang/Llama3-70B-Chinese-Chat · Hugging Face. Automatic summary:
- Llama3-70B-Chinese-Chat is an LLM for Chinese and English users with a wide range of capabilities.
- On Chinese tasks, Llama3-70B-Chinese-Chat outperforms ChatGPT and is comparable to GPT-4.
- Llama3-70B-Chinese-Chat was trained with the ORPO algorithm on a large mixed Chinese and English dataset.
- Llama3-70B-...
Shenzhi Wang (Shenzhi-Wang) on GitHub, pinned repository: Llama3-Chinese-Chat (public).
--output_dir /nfsdata/gpu007/dingfei/commondata/huggingface/Meta-Llama-3-8B-Instruct-saves/shenzhi-wang/Llama3-8B-Chinese-Chat-v1-test1 --per_device_train_batch_size 2 --per_device_eval_batch_size 2 --gradient_accumulation_steps 4 --lr_scheduler_type cosine --log_level info ...
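For reference, with these flags the effective batch size is per_device_train_batch_size × gradient_accumulation_steps × number of GPUs; assuming the 8-GPU deepspeed launch quoted below, that is 2 × 4 × 8 = 64 samples per optimizer step.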
Hello author, I ran the Llama3-8B-Chinese-Chat-q4_0-v2_1.gguf model in a DJL environment, and some words in the inference output come out garbled. Debugging the code, I found that a single UTF-8 byte sequence gets split across two byte arrays; the correct UTF-8 encoding of '奋' is [-27, -91, -117].
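The garbling happens when a streamed chunk boundary falls inside a multibyte UTF-8 sequence, so decoding each byte array on its own fails. A minimal Python sketch of the split and of a client-side fix with an incremental decoder (the DJL code in the issue is Java; the byte values here just mirror the '奋' example above):

    import codecs

    # '奋' is three bytes in UTF-8: 0xE5 0xA5 0x8B (signed-byte view: [-27, -91, -117]).
    # If the stream splits after the second byte, decoding each chunk independently fails.
    chunks = [b"\xe5\xa5", b"\x8b"]

    # An incremental decoder buffers the incomplete prefix until the remaining bytes arrive.
    decoder = codecs.getincrementaldecoder("utf-8")()
    text = "".join(decoder.decode(chunk) for chunk in chunks)
    assert text == "奋"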
In my tests, the prompt you used should no longer produce replies with English mixed in. Owner Shenzhi-Wang commented May 6, 2024: You can try our v2.1 release; this problem is greatly reduced there. https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/commit/4788ab8512511daa7b80f75c85ceb703661a4a4c
cd LLaMA-Factory
deepspeed --num_gpus 8 src/train_bash.py --deepspeed ${Your_Deepspeed_Config_Path} \
Author shake commented Apr 24, 2024: It seems this was my problem. shake closed this as completed on Apr 24, 2024.
Is there a Gradio-style web UI for deployment? · Issue #4 · Shenzhi-Wang/Llama3-Chinese-Chat
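For a quick web UI, a minimal Gradio sketch along these lines should work, assuming transformers and gradio are installed and the shenzhi-wang/Llama3-8B-Chinese-Chat weights are reachable on the Hub or locally; the generation settings are illustrative, not the repository's recommended ones:

    import torch
    import gradio as gr
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "shenzhi-wang/Llama3-8B-Chinese-Chat"  # assumed Hub id
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    def chat(message, history):
        # Rebuild the conversation for the Llama-3 chat template; handle both Gradio history styles.
        messages = []
        for turn in history:
            if isinstance(turn, dict):          # "messages" style: {"role": ..., "content": ...}
                messages.append(turn)
            else:                               # pair style: [user_text, assistant_text]
                user_text, assistant_text = turn
                messages.append({"role": "user", "content": user_text})
                if assistant_text:
                    messages.append({"role": "assistant", "content": assistant_text})
        messages.append({"role": "user", "content": message})
        input_ids = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        output_ids = model.generate(
            input_ids, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.9
        )
        return tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)

    gr.ChatInterface(chat).launch()

This loads the full-precision weights; for the quantized GGUF builds mentioned in the thread above, serving through llama.cpp or a wrapper such as llama-cpp-python behind the same Gradio front end would be the lighter-weight route.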
Has the context_length been fine-tuned to 2k? · Issue #3 · Shenzhi-Wang/Llama3-Chinese-Chat
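One quick way to see the configured context window, assuming transformers is installed, is to read max_position_embeddings from the published config:

    from transformers import AutoConfig

    # config.json is fetched from the Hub; max_position_embeddings is the configured context window.
    config = AutoConfig.from_pretrained("shenzhi-wang/Llama3-8B-Chinese-Chat")
    print(config.max_position_embeddings)

Note that training with a shorter cutoff length (e.g. a 2k-token cutoff in LLaMA-Factory) does not by itself change this value, which for Meta-Llama-3-8B-Instruct derivatives is 8192.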