Solution: modify the code to enable quantization by passing `load_in_8bit=True` (or `load_in_4bit=True`):

```python
model = LlamaForCausalLM.from_pretrained(
    "checkpoint/ziya-13b-qlora-sft-merge",
    torch_dtype=torch.float16,
    load_in_8bit=True,
    device_map='auto',
)
```

Published 2023-07-28 16:27 · Hubei
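Quantization works here because it shrinks the memory footprint of the model weights themselves. A rough back-of-the-envelope sketch of the weight memory for a ~13B-parameter model at each precision (weights only; activations, the KV cache, and quantization overhead are ignored, and the exact parameter count is an assumption):

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory (in GiB) needed just to hold the model weights."""
    return num_params * bits_per_param / 8 / 1024**3

params = 13e9  # Ziya-13B: roughly 13 billion parameters (approximate)

fp16 = weight_memory_gib(params, 16)  # float16 baseline, ~24 GiB
int8 = weight_memory_gib(params, 8)   # load_in_8bit=True, ~12 GiB
int4 = weight_memory_gib(params, 4)   # load_in_4bit=True, ~6 GiB
```

So on a 24 GB card the fp16 weights alone barely fit (leaving no room for activations), while 8-bit halves the requirement and 4-bit halves it again, which is the trade-off the two flags expose.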