$ llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 739 (8b9cc7c)
built with cc (Debian 12.2.0-14) 12.2.0 ...
Repository directory listing (latest commit per directory):
ggml : move AMX to the CPU backend (#10570), 6 months ago
prompts | llama : add Qwen support (#4281), 2 years ago
requirements | mtmd : rename llava directory to mtmd (#13311), 1 month ago
scripts | sync : ggml, 30 days ago
src | vocab : add ByteDance-Seed/Seed-Coder (#13423) ...
llama.cpp requires the model to be stored in the GGUF file format. Models in other data formats can be converted to GGUF using the convert_*.py Python scripts in this repo. The Hugging Face platform provides a variety of online tools for converting, quantizing, and hosting models with llama...
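As a quick sanity check after conversion, every GGUF file begins with the 4-byte ASCII magic `GGUF`. A minimal sketch of such a check; the `is_gguf` helper is illustrative and not part of the repo:

```python
GGUF_MAGIC = b"GGUF"  # all GGUF files start with these 4 bytes

def is_gguf(path: str) -> bool:
    """Return True if the file at `path` carries the GGUF magic header."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC
```

This only verifies the header, not that the full metadata and tensor layout are valid; the convert scripts and llama.cpp's loader perform the complete validation.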
git clone https://github.com/kvcache-ai/ktransformers.git
cd ktransformers
git submodule init
git submodule update
[Optional] If you want to run the website, compile the website before executing bash install.sh.
Compile and install (Linux): bash install.sh
Compile and install (Windows): ...
Variety of models supported (LLaMa2, Mistral, Falcon, Vicuna, WizardLM; with AutoGPTQ, 4-bit/8-bit, LoRA, etc.)
GPU support from HF and LLaMa.cpp GGML models, and CPU support using HF, LLaMa.cpp, and GPT4All models
Attention Sinks for arbitrarily long generation (LLaMa-2, Mistral...
Backend | pip extra | Models supported | Request types
Llama.cpp (via llama-cpp-python) ✔️ | gguf, ggml | All models supported by llama.cpp | generate_until, loglikelihood (perplexity evaluation not yet implemented)
vLLM ✔️ | vllm | Most HF Causal Language Models | generate_until, loglikelihood, loglikelihood_rolling
Mamba ✔️ | mamba_ssm | Mamba...
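The `loglikelihood` request type listed above scores a (context, continuation) pair by summing the log-probabilities the model assigns to each continuation token, and additionally reports whether greedy decoding would reproduce the continuation exactly. A minimal sketch of that scoring step, assuming the per-token log-probabilities and token ids have already been obtained from a backend (the `loglikelihood` function name here is illustrative, not the harness API):

```python
def loglikelihood(continuation_logprobs, greedy_tokens, continuation_tokens):
    """Sum per-token log-probabilities of the continuation and flag
    whether greedy decoding matches the continuation token-for-token."""
    total = sum(continuation_logprobs)
    is_greedy = list(greedy_tokens) == list(continuation_tokens)
    return total, is_greedy
```

`loglikelihood_rolling` (supported by the vLLM backend above) applies the same idea over a whole document in windows, which is why it also enables perplexity evaluation.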
Submodule llama.cpp updated, 10 files:
+10 −1   .gitignore
+27 −1   README.md
+117 −0  examples/nexa-omni-audio/omni.cpp
+12 −1   examples/nexa-omni-audio/omni.h
+115 −0  examples/qwen2-audio/qwen2.cpp
+12 −1   examples/qwen2-audio/qwen2.h
+15 −0   ggml/src/...