packages/llm/vllm/build.sh (2 changes: 2 additions & 0 deletions)
@@ -29,6 +29,8 @@
git clone --recursive --depth=1 https://github.com/vllm-project/vllm /opt/vllm
cd /opt/vllm
# apply patches: Remove switching to ...
GIT_REPOSITORY https://github.com/vllm-project/flash-attention.git
GIT_TAG 5259c586c403a4e4d8bf69973c159b40cc346fb9
GIT_TAG d886f88165702b3c7e7744502772cd98b06be9e1
GIT_PROGRESS TRUE
# Don't share the vllm-flash-attn build between build types
BINARY_DIR ${CMAKE_BINARY_DIR}/vllm-flash...
Dao-AILab/flash-attention: Fast and memory-efficient exact attention.
url="https://github.com/vllm-project/flash-attention.git",
classifiers=[
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: BSD License",
@@ -335,14 +288,7 @@ def __init__(self, *args, **kwargs) -> None: ...
A high-throughput and memory-efficient inference and serving engine for LLMs - [Misc] Use vllm-flash-attn instead of flash-attn (#4686) · Alexei-V-Ivanov-AMD/vllm@89579a2
It seems like it supports paged kv cache (https://github.com/Dao-AILab/flash-attention/pull/831/files). IIUC, you can just pass k cache and v cache to k and v for flash_attn_varlen_func?
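For readers unfamiliar with the term, here is a minimal pure-Python sketch of what "paged" means for a KV cache: a sequence's entries live in fixed-size blocks that need not be contiguous, and a block table maps logical positions to physical blocks. This is only a toy illustration. `gather_paged`, `BLOCK_SIZE`, and the data layout are hypothetical names chosen for this example and are not the flash-attn API or its memory layout.

```python
# Toy illustration of paged KV cache lookup; NOT the flash-attn API.
from typing import List

BLOCK_SIZE = 4  # tokens per cache block (hypothetical value for this sketch)


def gather_paged(cache: List[List[str]], block_table: List[int], seq_len: int) -> List[str]:
    """Return the first seq_len entries of one sequence stored across cache blocks."""
    out = []
    for logical_idx in range(seq_len):
        block = block_table[logical_idx // BLOCK_SIZE]  # physical block id
        offset = logical_idx % BLOCK_SIZE               # slot inside that block
        out.append(cache[block][offset])
    return out


# Physical cache with 3 blocks; the sequence occupies blocks 2 and 0, in that order.
k_cache = [
    ["k4", "k5", "-", "-"],      # block 0: tail of the sequence, partly filled
    ["x", "x", "x", "x"],        # block 1: belongs to some other sequence
    ["k0", "k1", "k2", "k3"],    # block 2: head of the sequence
]
keys = gather_paged(k_cache, block_table=[2, 0], seq_len=6)
print(keys)  # ['k0', 'k1', 'k2', 'k3', 'k4', 'k5']
```

A kernel with paged KV cache support would perform this indirection internally, which is why the comment above suggests the cache tensors might be passed in directly; whether that works for a given flash-attn version should be checked against the linked PR.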
git clone git@github.com:vllm-project/vllm.git
cd vllm
sudo docker build --target build -t vllm_build .
container_id=$(sudo docker create --name vllm_temp vllm_build:latest)
sudo docker cp ${container_id}:/workspace/dist .
This builds the container up to the build stage, which...
cadedaniel closed this as completed Jun 5, 2024