Gemma2 needs torch>=2.4.0, as mentioned there, because when I run it I get this error: File "/usr/local/lib/python3.10/dist-packages/transformers/cache_utils.py", line 1656, in __init__ torch._dynamo.mark_static_address(new_layer_key_cache) AttributeError: module 'torch._dynamo' has no ...
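For anyone hitting this, a quick check (my own sketch, not from the thread) is to confirm whether the installed torch actually exposes the attribute that transformers' cache code calls:

    import torch
    import torch._dynamo

    print(torch.__version__)
    # transformers' cache_utils calls torch._dynamo.mark_static_address; older torch
    # builds don't have it, which produces exactly this AttributeError
    print(hasattr(torch._dynamo, "mark_static_address"))

If this prints False, upgrading torch (>=2.4.0 as noted above) is the fix.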
I met the same problem. At first I thought it was because I hadn't run setup.py but had directly downloaded the GitHub files. Although I tried for a long time, I still couldn't find it.
The best approach is to find the wheel matching your environment at https://github.com/Dao-AILab/flash-attention/releases. For my setup (CUDA 11.7, torch 2.0, Python 3.9.8) I picked the latest matching wheel. I first installed flash_attn-2.3.2+cu117torch2.0cxx11abiTRUE-cp39-cp39-linux_x86_64.whl and still got the import error; in the end I installed flash_attn-2.3.5+cu117torch2.0cxx11abiFAL...
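To read the matching tags off your own environment, a small sketch like this (my own, not from the post) prints the values that appear in the wheel filename:

    import sys
    import torch

    print("torch  :", torch.__version__)                # -> the torch2.0 part of the name
    print("cuda   :", torch.version.cuda)               # -> the cu117 part of the name
    print("python :", sys.version_info[:2])             # -> the cp39 part of the name
    print("cxx11  :", torch.compiled_with_cxx11_abi())  # -> cxx11abiTRUE / cxx11abiFALSE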
Then test whether the flash-attention-2 backend works in this new environment. If the steps above still don't solve the problem, check the issues page of the flash_attn GitHub repository to see whether other users have run into and solved a similar problem, or open a new issue there to ask for help.
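A minimal smoke test for that backend could look like the sketch below (my own; it assumes a CUDA GPU and uses arbitrary shapes: batch 1, sequence length 8, 4 heads, head dim 64, fp16):

    import torch
    from flash_attn import flash_attn_func

    q = torch.randn(1, 8, 4, 64, dtype=torch.float16, device="cuda")
    k = torch.randn_like(q)
    v = torch.randn_like(q)
    out = flash_attn_func(q, k, v, causal=True)
    print(out.shape)  # torch.Size([1, 8, 4, 64]) if the backend is usable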
Installed the following whl: https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.6/flash_attn-2.5.6+cu122torch2.2cxx11abiFALSE-cp310-cp310-linux_x86_64.whl; got the error: RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its ...
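The transformers wrapper tends to hide the root cause. A diagnostic step I'd suggest (not from the post) is to import the compiled extension directly, which usually surfaces the real error, for example an "undefined symbol" ImportError when the wheel's cuXXX/torchX.Y/cxx11abi tags don't match the installed torch:

    import importlib

    # raises the underlying ImportError directly instead of the wrapped RuntimeError
    importlib.import_module("flash_attn_2_cuda")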
It is not possible to script flash_attn_2_cuda.varlen_fwd with torch.jit.script. Error message: RuntimeError: Python builtin <built-in method varlen_fwd of PyCapsule object at 0x7806d86a63a0> is currently not supported in Torchscript: Ha...
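One possible workaround (a sketch under my own assumptions, not something the thread confirms) is to keep the flash-attn call out of TorchScript compilation with @torch.jit.ignore, so the PyCapsule builtin is invoked as plain Python at runtime; note the scripted module then cannot be serialized on its own:

    import torch
    from flash_attn import flash_attn_func

    class FlashAttnBlock(torch.nn.Module):
        @torch.jit.ignore
        def _attn(self, q, k, v):
            # flash_attn_2_cuda's builtins cannot be scripted, so this call stays in Python
            return flash_attn_func(q, k, v, causal=True)

        def forward(self, q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
            return self._attn(q, k, v)

    scripted = torch.jit.script(FlashAttnBlock())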
gaoshangle opened this issue on Jan 22, 2025 (0 comments): with the vLLM framework, how do you configure --flash-attn2? gaoshangle closed it as completed on Jan 24, 2025.
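For reference, vLLM normally selects the FlashAttention backend by itself when flash-attn is usable; a hedged sketch of pinning it explicitly (my own, not from the issue) uses the VLLM_ATTENTION_BACKEND environment variable, set before vllm is imported:

    import os

    os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"  # force the flash-attn backend

    from vllm import LLM

    llm = LLM(model="facebook/opt-125m")  # model id is only a placeholder for illustration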
Right-click and copy the link, then on Linux download the whl package with wget + the link: wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu123torch2.3cxx11abiFALSE-cp310-cp310-linux_x86_64.whl. Finally run pip install <path to the whl> to install flash-attn, and you're done!
Solution. Method 1: find the whl file matching your CUDA and torch versions in the official releases, download it, and install it locally with pip3 install ${whl}. Method 2: build directly from source; see the official GitHub for details. Author: Garfield2005
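As an end-to-end check after either method (a sketch with a placeholder model id, not from the original post), you can ask transformers for the flash-attention-2 implementation explicitly; it raises a clear error if flash-attn is missing or built against the wrong torch:

    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",            # placeholder; any FA2-supporting model works
        torch_dtype=torch.float16,             # FA2 requires fp16 or bf16
        attn_implementation="flash_attention_2",
    ).to("cuda")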