pip install flash_attn fails with an error when run on an NPU. My demo code is as follows: import torch from modelscope import AutoTokenizer, AutoModelForCausalLM, GenerationConfig model_name = "/root/clark/DeepSeek-V2-Chat" tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True) max_memory = {i: "7...
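A common workaround when the CUDA-only flash_attn wheel cannot be built on an NPU is to load the model without flash attention at all. Below is a minimal sketch, assuming the model's remote code honours the standard `attn_implementation` argument passed through `from_pretrained` (as recent transformers/modelscope versions do); the dtype choice is illustrative and not taken from the original demo.

```python
import torch
from modelscope import AutoTokenizer, AutoModelForCausalLM

model_name = "/root/clark/DeepSeek-V2-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Ask for plain eager attention so the CUDA-only flash_attn package is never imported.
# (assumption: the remote modeling code respects attn_implementation)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
)
model.eval()
```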
And it is not working: (plaid) PS D:\Users\12625\PycharmProjects\plaid-main\plaid-main> pip install -U wheel Requirement already satisfied: wheel in c:\users\12625\anaconda3\envs\plaid\lib\site-packages (0.38.4) Collecting wheel Obtaining dependency information for wheel from https://files...
pip install flash-attn --no-build-isolation fails, but pip install flash-attn==1.0.9 --no-build-isolation works. Based on this, can you say what I might try to fix the error? torch.__version__ = 2.0.1+cu117 fatal: not a git repository (o...
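When a recent flash-attn release fails to build but an old pin such as 1.0.9 succeeds, the mismatch is usually between the installed torch/CUDA toolchain and what the newer source release expects. A quick sketch to see what you are actually building against (the version numbers in the comments are illustrative):

```python
import torch

# What flash-attn's build will compile against.
print("torch:", torch.__version__)              # e.g. 2.0.1+cu117
print("built with CUDA:", torch.version.cuda)   # e.g. 11.7
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device capability:", torch.cuda.get_device_capability(0))

# If the versions look right but the build still fails, pinning the last
# release known to work (as reported above) is a pragmatic fallback:
#   pip install flash-attn==1.0.9 --no-build-isolation
```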
3. Note that the README already tells you to install ninja beforehand, otherwise compilation will take a very long time. If ninja is already installed, you can run pip install flash-attn --no-build-isolation directly, but in practice building via a plain pip install is extremely slow, so it is strongly recommended to compile directly from source (with ninja installed first): git clone https://github.com/Dao-AILab/flash-attention.git ...
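Since the slow build almost always comes down to ninja not being picked up (the build then falls back to compiling one file at a time), it can help to verify ninja and cap the parallelism before kicking off the install. A rough pre-flight sketch, driven from Python for convenience; the MAX_JOBS value of 4 is just an example:

```python
import os
import shutil
import subprocess
import sys

# ninja must be on PATH, otherwise flash-attn compiles without parallelism.
if shutil.which("ninja") is None:
    subprocess.check_call([sys.executable, "-m", "pip", "install", "ninja"])
print(subprocess.run(["ninja", "--version"], capture_output=True, text=True).stdout)

# Limit parallel compile jobs so the build does not exhaust RAM on smaller machines.
os.environ["MAX_JOBS"] = "4"

# Same install command as above, just invoked from Python.
subprocess.check_call(
    [sys.executable, "-m", "pip", "install", "flash-attn", "--no-build-isolation"]
)
```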
Using the int-quantized model: `import os import torch from transformers import AutoTokenizer, GenerationConfig from quant.modeling_telechat_gptq import TelechatGPTQForCausalLM os.environ["CUDA_VISIBLE_DEVICES"] = '0' model_path="/home/zhanghui/models/Tele-...
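For completeness, a sketch of how the truncated demo above typically continues. This assumes TelechatGPTQForCausalLM exposes the usual auto-gptq `from_quantized` loader and supports the standard `generate` API; the model path and prompt below are placeholders, not values from the original snippet.

```python
import os
import torch
from transformers import AutoTokenizer, GenerationConfig
from quant.modeling_telechat_gptq import TelechatGPTQForCausalLM

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

model_path = "/path/to/TeleChat-gptq-int4"  # placeholder path

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# Assumption: the GPTQ wrapper follows auto-gptq's from_quantized convention.
model = TelechatGPTQForCausalLM.from_quantized(
    model_path, device="cuda:0", trust_remote_code=True
)

generation_config = GenerationConfig.from_pretrained(model_path)
inputs = tokenizer("Hello, please introduce yourself.", return_tensors="pt").to("cuda:0")
with torch.no_grad():
    output = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```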
Deploying on Windows: the earlier problems were all resolved by following the fixes for similar cases in the issues, but I really cannot get past the pip install flash-attn --no-build-isolation step. I have struggled with it for a long time with no luck. Has anyone run into a similar problem? Owner Ucas-HaoranWei commented Sep 21, 2024 You can do without flash attention liujie-t commented Sep 26, 2024 Hi, I am also deploying on Windows...
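Following the maintainer's advice that flash attention can simply be skipped, one way to make a Windows setup degrade gracefully is to probe for the package at runtime and fall back to PyTorch's built-in attention. A minimal sketch, assuming the model loader accepts the standard `attn_implementation` argument; the path is a placeholder:

```python
import importlib.util

from transformers import AutoModelForCausalLM

# Use flash attention only if the flash_attn package is actually installed,
# otherwise fall back to PyTorch's scaled-dot-product attention.
has_flash_attn = importlib.util.find_spec("flash_attn") is not None
attn_impl = "flash_attention_2" if has_flash_attn else "sdpa"

model = AutoModelForCausalLM.from_pretrained(
    "path/to/model",  # placeholder path
    trust_remote_code=True,
    attn_implementation=attn_impl,
)
```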
Describe the issue Issue: I had errors when running the command "pip install flash-attn --no-build-isolation". It seems to be because I don't have CUDA; I am only using the M1 Max chip of a MacBook Pro with 64 GB of RAM. Command: pip install fl...
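flash-attn only builds against CUDA, so on an Apple-silicon machine the install is expected to fail regardless of flags. A quick sketch to confirm which backends the local torch actually has (the comments show what you would typically see on an M1 Max):

```python
import torch

print("CUDA available:", torch.cuda.is_available())         # False on Apple silicon
print("MPS available:", torch.backends.mps.is_available())  # True on an M1/M2 Mac

# With no CUDA toolkit, flash-attn's extension cannot compile; run the model
# without it instead (e.g. attn_implementation="eager" or "sdpa").
```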
pip install -e . pip install -e ".[train]" pip install flash-attn --no-build-isolation --no-cache-dir Command: bash my_python_finetune_lora.sh env: # Name Version Build Channel _libgcc_mutex 0.1 main https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main ...