packages/flash_attn/flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: /home/apus/mambaforge/envs/Qwen/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi #1061...
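An undefined c10::cuda symbol like this usually means the flash_attn_2_cuda extension was compiled against a different PyTorch build than the one currently installed (an ABI mismatch). A minimal diagnostic sketch, assuming only that torch is importable, that prints both sides of the mismatch and surfaces the same error:

```python
# Hedged diagnostic sketch: an undefined symbol such as _ZN3c104cuda9SetDeviceEi
# usually indicates flash-attn was built against a different torch than the one
# now installed. Print both versions so the mismatch is visible.
import torch

print("torch:", torch.__version__)        # version flash-attn must be built against
print("torch CUDA:", torch.version.cuda)  # CUDA toolkit torch was built with

try:
    import flash_attn
    import flash_attn_2_cuda  # noqa: F401  -- the compiled extension itself
    print("flash-attn:", flash_attn.__version__)
except ImportError as exc:
    # A mangled c10/at symbol here points at an ABI mismatch; rebuilding
    # flash-attn against the installed torch usually resolves it.
    print("flash-attn import failed:", exc)
```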
I found I was unable to import flash_attn_cuda after running python setup.py install. --- details --- I ran python setup.py install with a prefix pointing to the root dir of flash-attention, and I also set PYTHONPATH=$PWD, i.e. the absolute path of the flash-attention root dir. Any...
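A common pitfall with this kind of setup is that putting the flash-attention source tree on PYTHONPATH (or running Python from inside it) makes import flash_attn resolve to the source checkout, which contains no compiled extension, instead of the installed package. A small sketch, using only the standard library, to check which path actually wins:

```python
# Hedged sketch: print where Python resolves flash_attn and its compiled
# extension from. If flash_attn resolves to the source checkout while
# flash_attn_2_cuda is "not found", PYTHONPATH is shadowing the installed package.
import importlib.util

for name in ("flash_attn", "flash_attn_2_cuda"):
    spec = importlib.util.find_spec(name)
    print(name, "->", spec.origin if spec else "not found")
```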
import torch.nn.functional as F
from torch import nn

try:
    import xformers.ops
    MEM_EFFICIENT_ATTN = True
except ImportError:
    MEM_EFFICIENT_ATTN = False

class AttentionBlock(nn.Module):
    """
    An attention block that allows spatial positions to attend to each other. Originally ported...
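The MEM_EFFICIENT_ATTN flag set by the try/except above is typically consumed in the block's forward pass: use xformers' memory-efficient kernel when the import succeeded, and fall back to plain PyTorch attention otherwise. A self-contained sketch of that pattern (the attention helper and tensor shapes are illustrative, not the original module's code; torch 2.x is assumed for the fallback):

```python
# Hedged sketch of how a MEM_EFFICIENT_ATTN flag is usually consumed.
import torch
import torch.nn.functional as F

try:
    import xformers.ops
    MEM_EFFICIENT_ATTN = True
except ImportError:
    MEM_EFFICIENT_ATTN = False

def attention(q, k, v):
    """q, k, v: [batch, seq_len, num_heads, head_dim]."""
    if MEM_EFFICIENT_ATTN:
        # xformers takes [batch, seq_len, num_heads, head_dim] directly
        return xformers.ops.memory_efficient_attention(q, k, v)
    # torch's built-in SDPA expects [batch, num_heads, seq_len, head_dim]
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
    )
    return out.transpose(1, 2)
```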
import flash_attn_2_cuda as flash_attn_cuda
ImportError: /home/mdabdullah-_al-asad/.local/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15sum_IntList_out4callERKNS_6TensorEN3c1016OptionalArrayRefIlEEbSt8optionalINS5_10ScalarType...
Please add the results of the following commands after piping them to files: pip freeze > out.txt, echo $PATH > path.txt, and uname -a. It seems that there is no flash_attn.flash_attention module in flash-attn 2.x versions. Maybe you can try a 1.x or 0.x version, such as 0.2.8....
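Code written against the 0.x/1.x API can also guard its imports so it works on either side of the 2.x change. A hedged sketch of that guard (the fallback class name is the one flash-attn 1.x exposed):

```python
# Hedged sketch: flash-attn 2.x removed the flash_attn.flash_attention module,
# so older code often tries the 2.x functional API first and falls back.
try:
    from flash_attn import flash_attn_func          # flash-attn 2.x API
    FLASH_ATTN_V2 = True
except ImportError:
    from flash_attn.flash_attention import FlashAttention  # 0.x / 1.x API
    FLASH_ATTN_V2 = False
```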
xformers version: 0.0.27.post2
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4070 Ti : cudaMallocAsync
Using xformers cross attention
[Prompt Server] web root: D:\comfyui\ComfyUI\web
Adding extra search path checkpoints path/to/stable-diffusion-webui/models/Stable-diffusi...
Reminder
I have read the README and searched the existing issues.

Reproduction
(base) root@I19c2837ff800901ccf:/hy-tmp/LLaMA-Factory-main/src# CUDA_VISIBLE_DEVICES=0,1,2,3 python3.10 api.py \
    --model_name_or_path ../model/qwen/Qwen1.5-72...
Probably related to the flash-attn installation. In my case, the following worked:

pip uninstall flash-attn
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
pip install flash-attn --no-build-isolation --no-cache-dir
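After reinstalling, a quick smoke test helps confirm the extension not only imports but also runs. A sketch, assuming a CUDA GPU with fp16 support; the tensor shapes are illustrative:

```python
# Hedged smoke test: run one tiny flash_attn_func call on the GPU.
import torch
from flash_attn import flash_attn_func

# flash_attn_func expects [batch, seq_len, num_heads, head_dim] in fp16/bf16
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v)
print(out.shape)  # expected: torch.Size([1, 128, 8, 64])
```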
Thanks for sharing your amazing work, I was excited to give it a try. I followed the steps and built the kernel package in /models/csrc/, but when running the code I get an error as if the package does not exist. I am not sure if I am missing anything in between. Should...
replace_llama_attn_with_flash_attn()  # allow setting the token directly

generate.py (10 changes: 10 additions & 0 deletions)
@@ -0,0 +1,10 @@
import fire
from src.gen import main

def entrypoint_main():
    fire.Fire(main)

if __name_...